PREFACE



The purpose of this document is to supplement the TOPS-10 
MONITOR INTERNALS COURSE MATERIAL prose with graphic 
illustrations and additional support documents. 

This supplement is divided into parts with the page 
numbering continuous within each part but not continuous 
across parts. The page numbering in part 1, Graphics, 
is of the form "a-b" where "a" corresponds to the 
chapter number in the course materials and "b" is the 
page number within chapter "a". This numbering scheme 
facilitates cross referencing the course materials to 
the supplement. 



TABLE OF CONTENTS 



PART 1 - GRAPHICS 

INTRODUCTION 1-1 

MAPPING 1-3 

PI SYSTEM 1-10 

MONITOR BUILDING 1-11 

MONITOR CODING CONVENTIONS 1-12 

CLOCK CYCLE OVERVIEW 2-1 

CLOCK CYCLE LEVEL 3 2-2

CLOCK CYCLE LEVEL 7 2-3 

TIMING CHARTS 2-6 

ACCOUNTING 2-11 

CORE MANAGEMENT 3-1 

PAGE FAULTS 3-11 

COMMAND PROCESSING OVERVIEW 4-1 

COMMAND PROCESSING DETAILS 4-2 

DELAYED COMMANDS 4-11 

JOB STATE TRANSITIONS 5-1 

SCHEDULER QUERIES 5-2 

SCHEDULER DETAILED FLOWS 5-3

SWAPPER 6-1 

UUO PROCESSING OVERVIEW 7-1 

UUO PROCESSING DETAILS 7-2 

HOW TOPS-10 DIES 7-7

STOPCODES 7-15 

I/O MODULE ARCHITECTURE 8-1 

INIT UUO 8-7 

INBUF/OUTBUF UUO 8-13 

INPUT UUO 8-14 

OUTPUT UUO 8-16 

CLOSE UUO 8-17 

RELEASE UUO 8-19 

NOTES ON I/O UUOS 8-21 

INTERRUPT CHAIN 9-1 

MONGEN PI ASSIGNMENT 9-2 

PI ASSIGNMENT DEVICE GROUPS 9-3 

DEVICE INTERRUPT ROUTINE OVERVIEW 9-4 






ADVANCE BUFFER ROUTINE 9-5 

DEVICE DATA BLOCKS 9-8 

WAIT ROUTINES 9-9 

CALIN - START DEVICE ROUTINE 9-11 

SETIOD - ROUTINE TO UNBLOCK A JOB 9-14

A RACE CONDITION 9-20 

DISK RESIDENT DATA BASE 10-1 

CORE RESIDENT DISK DATA BASE 10-5 

DISK UUO I/O FLOWS 10-10 

I/O INSTRUCTION FORMAT 10-17 

DISK QUEUES 10-20 

DISK INTERRUPT LEVEL FLOWS - FILINT 10-21 

DISK POSITIONING OPTIMIZATION 10-24 

DISK TRANSFER OPTIMIZATION 10-25 

START I/O 10-26 

SET UP COMMAND LIST 10-27 



PART 2 - KL DOCUMENT 

PART 3 - KL SYSTEM OPERATIONS (CHAPTER 3 HARDWARE REFERENCE MANUAL)

PRIORITY INTERRUPTS 3.1 

CACHE MANAGEMENT 3.2 

TOPS-10 PAGING AND PROCESS TABLES 3.3 

MEMORY MANAGEMENT 3.5 

TIMING AND ACCOUNTING 3.6

ERROR AND DIAGNOSTIC INSTRUCTIONS 3.8 

PART 4 - SCHEDULER/SWAPPER PLM 

PART 5 - DISK I/O PROCESSING 

PART 6 - LABS



PART 1 



GRAPHICS 




TOPS-10 (DECsystem-10) TRAINING PROGRAM

[Flowchart: recommended course sequence for each job classification
(operator, system manager, system programmer, assembly language
applications programmer, COBOL applications programmer). The courses
shown include User, Operator, Administration, Assembly Language
Programming, COBOL, Advanced ALP, Application Programming Techniques,
Monitor Structure, OS Concepts, Monitor Internals, and Data Base
Management System. Most courses are marked lecture/lab.]


TOPS-10 Monitor Internals Length: 10 days 




This advanced course teaches the experienced programmer the
internal algorithms of the TOPS-10 operating system in detail.
In-depth studies of the monitor clock cycle and device service
routines receive equal emphasis. Students will study monitor
macros and conventions and the TOPS-10 data base in great detail,
and will learn methods for adding new commands, monitor calls
(UUOs), and device service routines to TOPS-10. Laboratory
exercises introduce on-line examination of the data base and
post-mortem crash analysis with the FILDDT utility.

The experienced programmer who completes this course will be
well grounded in the monitor's major algorithms, from core
management to communications service routines. He will feel
comfortable finding his way through the code, and will be capable
of modifying TOPS-10 to implement new features for his
installation.

Students: 

• System Programmers 

Will Learn to: 

• Describe the steps which must be followed in adding either a new
command or UUO to the monitor.

• Describe the principles involved in adding a new device service
routine.

• Given a specific system state, trace the control path through the
monitor.

• Describe the effects of an interrupt on the monitor data base and
on subsequent monitor behavior.

• Describe how a user disk I/O request is handled by the disk service
routines.

• Use FILDDT to examine the data base of a running monitor or to
post mortem a crash.

• Efficiently find the section of TOPS-10 code that performs a
particular function and follow its flow.

Lecture/Lab

Ensuring Success: 

The flowchart illustrates the proper course sequence for every job
classification within the TOPS-10 training program.
To ensure the training success of every participant, prospective
students must take all courses in the recommended sequence. For
example, before taking this course, you should have completed
TOPS-10 Monitor Structure and TOPS-10 Assembly Language
Programming. We also recommend six months of practical experience
as a systems programmer under TOPS-10.

Topics: 

• Monitor Coding Conventions and Cross-Reference Tools 

• Clock Routine 

• Core Management 

• Command Processor 

• Scheduler and Swapper 

• Monitor Calls and Device Service Routines

• File Service Routine 

• Communications Processor 

• FILDDT and Introduction to Crash Analysis 



1-2

TOPS-10 VIRTUAL ADDRESS SPACE AND PROCESS TABLE LAYOUT

[Figure MR-0750. The user virtual address space (256K, pages 000-777)
is mapped through the user process table, which also maps executive
pages 340-377 (16K) and holds the trap, MUUO, and page fail words.
The rest of the executive virtual address space, pages 000-337 (112K)
and pages 400-777 (128K), is mapped through the executive process
table, which also holds the channel logout areas, the interrupt
instructions, the channel block fill words, and the DTE20 control
blocks. Shaded areas are reserved.]

1-3



USER PROCESS TABLE

  WORD(S)    CONTENTS (LEFT HALF | RIGHT HALF)

  000        USER PAGE 0 | USER PAGE 1
   .              .
  377        USER PAGE 776 | USER PAGE 777
  400        EXECUTIVE PAGE 340 | EXECUTIVE PAGE 341
   .              .
  417        EXECUTIVE PAGE 376 | EXECUTIVE PAGE 377
  420        RESERVED
  421        USER ARITHMETIC OVERFLOW TRAP INSTRUCTION
  422        USER STACK OVERFLOW TRAP INSTRUCTION
  423        USER TRAP 3 TRAP INSTRUCTION
  424        MUUO STORED HERE
  425        MUUO OLD PC WORD
  426        MUUO PROCESS CONTEXT WORD
  427        RESERVED
  430        KERNEL NO TRAP MUUO NEW PC WORD
  431        KERNEL TRAP MUUO NEW PC WORD
  432        SUPERVISOR NO TRAP MUUO NEW PC WORD
  433        SUPERVISOR TRAP MUUO NEW PC WORD
  434        CONCEALED NO TRAP MUUO NEW PC WORD
  435        CONCEALED TRAP MUUO NEW PC WORD
  436        PUBLIC NO TRAP MUUO NEW PC WORD
  437        PUBLIC TRAP MUUO NEW PC WORD
  440-477    RESERVED
  500        PAGE FAIL WORD
  501        PAGE FAIL OLD PC WORD
  502        PAGE FAIL NEW PC WORD
  503        RESERVED
  504-505    USER PROCESS EXECUTION TIME
  506-507    USER MEMORY REFERENCE COUNT
  510-777    RESERVED


EXECUTIVE PROCESS TABLE

  WORD(S)    CONTENTS (LEFT HALF | RIGHT HALF)

  000-037    EIGHT CHANNEL LOGOUT AREAS
             EACH: 0 INITIAL CHANNEL COMMAND
                   1 GETS CHANNEL STATUS WORD
                   2 GETS LAST UPDATED COMMAND
                   3 RESERVED
  040-041    RESERVED
  042-057    STANDARD PRIORITY INTERRUPT INSTRUCTIONS
  060-063    FOUR CHANNEL BLOCK FILL WORDS
  064-137    RESERVED
  140-177    FOUR DTE20 CONTROL BLOCKS
  200        EXECUTIVE PAGE 400 | EXECUTIVE PAGE 401
   .              .
  377        EXECUTIVE PAGE 776 | EXECUTIVE PAGE 777
  400-420    RESERVED
  421        EXECUTIVE ARITHMETIC OVERFLOW TRAP INSTRUCTION
  422        EXECUTIVE STACK OVERFLOW TRAP INSTRUCTION
  423        EXECUTIVE TRAP 3 TRAP INSTRUCTION
  424-507    RESERVED
  510-511    TIME BASE
  512-513    PERFORMANCE ANALYSIS COUNT
  514        INTERVAL COUNTER INTERRUPT INSTRUCTION
  515-577    RESERVED
  600        EXECUTIVE PAGE 0 | EXECUTIVE PAGE 1
   .              .
  757        EXECUTIVE PAGE 336 | EXECUTIVE PAGE 337
  760-777    RESERVED


TOPS-10 PROCESS TABLE CONFIGURATION

MR-0751
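
The process table layout implies a simple rule for finding the map slot of any virtual page. The following is a minimal sketch in Python, not monitor code. The function name is hypothetical, and the convention that the even-numbered page of each pair sits in the left half of the word is an assumption read from the figure.

```python
def map_slot(space, page):
    """Locate the mapping entry for a virtual page.

    space is "user" or "exec"; page is a virtual page number 0-777 (octal).
    Returns (table, word offset, half), where half 0 is the left half
    (assumed to hold the even page) and half 1 is the right half.
    """
    if not 0 <= page <= 0o777:
        raise ValueError("page number must be in the range 0-777 octal")
    half = page & 1
    if space == "user":
        # User pages 0-777 occupy words 0-377 of the user process table.
        return ("UPT", page >> 1, half)
    if space != "exec":
        raise ValueError("space must be 'user' or 'exec'")
    if page <= 0o337:
        # Exec pages 0-337 occupy words 600-757 of the executive process table.
        return ("EPT", 0o600 + (page >> 1), half)
    if page <= 0o377:
        # Exec pages 340-377 (per process) occupy words 400-417 of the UPT.
        return ("UPT", 0o400 + ((page - 0o340) >> 1), half)
    # Exec pages 400-777 occupy words 200-377 of the executive process table.
    return ("EPT", 0o200 + ((page - 0o400) >> 1), half)
```

For example, `map_slot("exec", 0o341)` points at the right half of user process table word 400, which is why the per-process pages follow the job at every context switch.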



1-4



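VIRTUAL TO PHYSICAL ADDRESS TRANSLATION

EFFECTIVE ADDRESS: 18 bits = 256K words (512 pages), divided into a
9-bit virtual page number and a 9-bit offset within the 512-word page.

HARDWARE PAGE TABLE: indexed by the virtual page number (computed
index). Each entry supplies a 13-bit physical page number. An entry
is loaded from the UPMP/EPMP if it is not already here.

PHYSICAL ADDRESS: 22 bits = 4096K words (8192 pages), formed from the
13-bit physical page number and the 9-bit offset.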
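
The arithmetic of the translation can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and a plain dictionary stands in for the hardware page table, with a missing key playing the role of a page fault.

```python
PAGE_BITS = 9        # width of the virtual page number
OFFSET_BITS = 9      # width of the offset within a 512-word page
PHYS_PAGE_BITS = 13  # width of the physical page number


def split_virtual(addr):
    """Split an 18-bit virtual address into (virtual page, offset)."""
    if not 0 <= addr < 1 << (PAGE_BITS + OFFSET_BITS):
        raise ValueError("virtual address must fit in 18 bits")
    return addr >> OFFSET_BITS, addr & ((1 << OFFSET_BITS) - 1)


def translate(addr, page_table):
    """Form the 22-bit physical address for an 18-bit virtual address.

    page_table maps virtual page number -> 13-bit physical page number.
    A KeyError here stands in for a page fault.
    """
    vpage, offset = split_virtual(addr)
    ppage = page_table[vpage]
    if not 0 <= ppage < 1 << PHYS_PAGE_BITS:
        raise ValueError("physical page number must fit in 13 bits")
    return (ppage << OFFSET_BITS) | offset
```

Because a page is 512 words, the low three octal digits of an address pass through unchanged; only the page number is replaced.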



USER PAGE MAP (UPMP) MAPPING ENTRY

18 BIT QUANTITY - 512 PER UPMP

A = 0 ACCESS DENIED, PAGE FAULT OCCURS
  = 1 ACCESS ALLOWED
P = 0 CONCEALED PAGE (EXECUTE ONLY)
  = 1 PUBLIC PAGE
W = 0 WRITE PROTECTED
  = 1 WRITABLE
S = 0 ALLOCATED
  = 1 ALLOCATED BUT ZERO
C = 0 CACHEABLE
  = 1 NOT CACHEABLE
PAGE ADDRESS - 13 BIT PHYSICAL MEMORY PAGE NUMBER OR
             - 17 BIT SWAPPING SPACE ADDRESS
               (INCLUDES P,W,S,C BITS)
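
A mapping entry can be picked apart with simple masks. The sketch below is in Python and rests on one assumption that the legend does not state: that the flag bits appear in the order A, P, W, S, C from the high-order end of the 18-bit entry, followed by the 13-bit page number. The type and function names are hypothetical.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    access: bool     # A: 1 = access allowed
    public: bool     # P: 1 = public page
    writable: bool   # W: 1 = writable
    s_bit: bool      # S: 1 = allocated but zero
    no_cache: bool   # C: 1 = not cacheable
    page: int        # 13-bit physical page number


def decode_entry(entry):
    """Decode an 18-bit mapping entry (assumed bit order A,P,W,S,C,page)."""
    if not 0 <= entry < 1 << 18:
        raise ValueError("mapping entry must fit in 18 bits")
    return MapEntry(
        access=bool(entry & 0o400000),
        public=bool(entry & 0o200000),
        writable=bool(entry & 0o100000),
        s_bit=bool(entry & 0o040000),
        no_cache=bool(entry & 0o020000),
        page=entry & 0o017777,
    )
```

Under this assumed layout, an entry of 500341 octal decodes as accessible, writable, concealed, cacheable, physical page 341.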



\-!c 



z-x^c Pi: 



.£.13 



CPDDF MUUO C<F} 



40/
41/                     (JSR LUUOPC placed here at SYSINI time)

LOC 420
420/MUUO SEILM##        ; Page fault trap
421/JFCL                ; Arithmetic trap
422/MUUO SEPDLO##       ; Push down overflow trap
423/JSR TRP3PC          ; Trap 3 trap

LUUOPC:
        EXCH T1,LUUOPC
        MOVEM T1,UUO0
        JRST UUOERR##



User's Page Map 



Kl sT/i£ 



Lcc NLUPMP (= 2000) 

NUPPPK = NLUPMP + 400 
Loc NUPPPM 



4Q0/PM. 



iCC -f- 



417/PM.ACC 



PM.WRT + 340,, PM.ACC + PM.vvRT + 341 \ 
PM.WRT + 342,, PM.ACC + PM.WRT + 343 



■f ^7i 



Exer. Per Process 
Mao 






420/ 
421/ 
422/ 
423/ 

**.*/ 

425/ 
426/ 
427/ 
430/ 
431/ 
432/ 
433/ 
434/ 
435/ 
435/ 
437/ 



MUUO SEILM#; 
JFCL 3AROUF- 
MUUO SSPDLO, 
JFCL 



EXP 
EXP 
EX? 
EXP 
EXP 
EX? 
EX? 
EXP 
EXP 
EX? 
EX? 
EXP 











IC. ucu 

IC.UOU 

IC.UOU 

IC . uou 

IC.UOU 
IC.UOU 
IC.UOU 

IC.UOU 






r MUUO## 

KTUUO## 

SNTUUOf; 

STUUO 

MUUO## 

CTUUO 

MUUO## 

FTUUO 



UPMUO 



SNTUUO : 
STUUO: 
CTUUO : 
PNTUUO : 
FTUUO : 



Halt 
Halt 
JRST 
JRST 
JRST 



T'JCTi P 



.UPMPT.UPMUO' ; 
. UPMP + . UP MUG ' 
.UPM? + .UPMTJO 



Page fault trap
Arithmetic trap
Push down overflow trap
Trap 3 trap
MUUO stored here
PC word of MUUO stored here
Exec page fail word
User page fail word
Kernel no trap MUUO new PC word
Kernel trap MUUO new PC word
Supervisor no trap MUUO new PC word
Supervisor trap MUUO new PC word
Concealed no trap MUUO new PC word
Concealed trap MUUO new PC word
Public no trap MUUO new PC word
Public trap MUUO new PC word

Dispatch to kernel mode trap handler

Dispatch to user mode trap handler

Come here on a MUUO call to simulate a KA10 UUO
MUUO: 10 



1-7

Executive Virtual Memory

Monitor Virtual Address Space


6.03A

  Page       Contents

  0          Absolute Locations
  1          EPMP (CPU0)
  2          EPMP (CPU1)
  3          Null Job UPMP
   .         Monitor Low Segment
  340        UPMP
  341        JOBDAT
  342        Vestigial JOBDAT
  343        TEMP
  400        Used to Build UPMP
  401        Swapping Checksum
  402        PI Level Temporaries
  411        SKPCPU Instruction
  412        PAGTAB
  432        MEMTAB
  452        Monitor High Segment (to SYSSIZ)
   .         EVM
  777


7.01

  Page       Contents

  0          Absolute Locations
  1          EPMP (CPU0)
  2          EPMP (CPU1)
  3          Null Job UPMP
   .         Monitor Low Segment
  340-367    Funny Space
  370        UPMP
  371        JOBDAT
  372        Vestigial JOBDAT
  373        TEMP
  400        Used to Build UPMP
  401        Swapping Checksum
  402        PI Level Temporaries
  411, 412   CDB
  414        PAGTAB
  434        MEMTAB
  454        Monitor High Segment (to SYSSIZ)
   .         EVM
  777

1-8



MONITOR ADDRESSABILITY

METHOD        USE              OVERHEAD                  RESTRICTIONS

PER PROCESS   ACCESS UPMP      SETTING UP UPMP MAPPING   CURRENT JOB
              & JOBDAT         ENTRIES FOR EXEC MODE
                               PAGING

XCT           UUO ARGUMENTS,   NONE                      CURRENT JOB
              USER ACS

EVM           I/O BUFFERS      SETTING UP EPMP MAPPING   NONE
                               ENTRIES FOR EXEC MODE
                               PAGING

1-9



Interrupt Programming 

The program can control the priority interrupt system by means of condition I/O instructions.
The device code is 004, mnemonic PI.

CONO PI,E
Perform the functions specified by the effective conditions E as shown (a 1 in a bit produces the
indicated function, a 0 has no effect).

CONI PI,E
Read the status of the priority interrupt (and several diagnostic bits) into location E as shown.

MONITOR AC LOCATIONS



All sixteen monitor AC locations (that is, the TOPS-10 system's
sixteen fast-memory locations) have names that describe their
contents. These names remain the same throughout the monitor.
The following is a description of these locations, also known as CPU
registers.

Fast-Memory Locations 0 to 17

0 S  - Contains the status word from a DDB while the monitor is
       processing I/O operations.

1 P  - Contains the pushdown stack pointer currently in use.

2 J  - Contains the job number, high-segment number, or
       controller data block address at interrupt level.

3 R  - Contains the job's relocation value. On KI or KL
       machines, this usually points to the user page map via
       exec page 341; that is, R contains the value 341000.
       If the job is locked in EVM, however, it contains the
       exec virtual address of the job.

4 F  - Contains the file DDB address when the monitor is
       working with I/O. This AC is usually used as a
       temporary register when the monitor is executing code in
       an area not concerned with I/O.

5 U  - Contains the unit data block address in FILSER; holds
       the line data block address in SCNSER. This AC is
       generally associated with a particular I/O device.

6 T1 - Is an unpreserved temporary AC.

7 T2 - Is an unpreserved temporary AC.

10 T3 - Is an unpreserved temporary AC.

11 T4 - Is an unpreserved temporary AC.

12 M - Contains a mask or, in UUOCON, holds the UUO address
       and special bits.

13 W - Usually contains the pointer to the process data block;
       is a general work register.

14 P1 - Is a preserved temporary AC.

15 P2 - Is a preserved temporary AC.

16 P3 - Is a preserved temporary AC.

17 P4 - Is a preserved temporary AC; often points to the CPU
        data block.

Notice that two sets of general-purpose registers are provided, T1 to
T4 and P1 to P4. When the system programmer writes a subroutine, he
can use T1 through T4 without preserving their original contents,
because the caller is responsible for saving anything it still needs
from them. By the same rule, he should assume that any subroutine he
calls may destroy T1 through T4. If he wants to use P1 through P4,
however, he must first save their contents and restore them before
returning. Once they are saved, he can call other subroutines and
expect P1 through P4 to come back intact.
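
The register assignments and the save rule can be summarized as a small lookup table. This is a sketch in Python for reference only; the table and function names are hypothetical and nothing here is monitor code.

```python
# AC names in numeric order, 0 through 17 octal, as listed above.
AC_NAMES = ["S", "P", "J", "R", "F", "U",
            "T1", "T2", "T3", "T4",
            "M", "W",
            "P1", "P2", "P3", "P4"]

# Map each name to its AC number.
AC_NUMBER = {name: number for number, name in enumerate(AC_NAMES)}

# Preserved temporaries: a subroutine must save these before using them.
PRESERVED = {"P1", "P2", "P3", "P4"}

# Unpreserved temporaries: any called subroutine may destroy these.
UNPRESERVED = {"T1", "T2", "T3", "T4"}


def must_save_before_use(name):
    """True if a subroutine has to save this AC before changing it."""
    return name in PRESERVED


def survives_call(name):
    """True if the AC can be expected to be intact after calling a subroutine."""
    return name not in UNPRESERVED
```

The two helper functions state the convention from both sides: the callee's obligation for P1 to P4, and the caller's exposure for T1 to T4.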



Conventions in Symbol Naming



Symbols defining numbers begin with a dot, followed by a two-letter prefix.
Bits begin with a two-letter prefix, followed by a dot, and UUO names end
with a dot.



GETTAB word arguments start with %. GETTAB masks are of the form XX%YYY;
error codes end with %; and $ symbols are reserved for the installations.
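
Taking the conventions above to mean dot-first for numbers, prefix-dot for bits, a leading % for GETTAB word arguments, XX%YYY for GETTAB masks, a trailing % for error codes, and $ for installation symbols, a symbol can be classified by its shape alone. The Python sketch below is a hypothetical helper, not a DEC tool, and the category names are chosen for illustration.

```python
import re


def classify(symbol):
    """Classify a monitor symbol by the shape of its name."""
    if "$" in symbol:
        return "installation-reserved"
    if symbol.startswith("%"):
        return "GETTAB word argument"
    if re.fullmatch(r"[A-Z0-9]{2}%[A-Z0-9]+", symbol):
        return "GETTAB mask"
    if symbol.endswith("%"):
        return "error code"
    if re.fullmatch(r"\.[A-Z0-9]{2}[A-Z0-9]+", symbol):
        return "number"
    if re.fullmatch(r"[A-Z0-9]{2}\.[A-Z0-9]+", symbol):
        return "bits"
    return "unclassified"
```

Symbols that appear elsewhere in this supplement fit the pattern: .CPPC and .UPMP classify as numbers, while PM.ACC and IC.UOU classify as bits.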



DATA BLOCK WORD ADDRESSES:

.JB??? Job data area symbols (JOBDAT.MAC)

? lllll 5 ALLI UU0 s >"= bol s implemented after 5.03 (UUOCON.MAC) 

.PD??? Locations in process data block (PDB), usually indexed by W.

.RB??? Extended arguments (LOOKUP, ENTER, RENAME) (S.MAC)

.CP??? Locations in CPU data block (CDB)

.C0??? Locations in CPU0 CDB (COMMON.MAC)

.C1??? Locations in CPU1 CDB (COMMON.MAC)

.GT??? GETTAB table numbers (UUOCON.MAC)

*f^,lll KI1 & exec P £ S e ma P page symbols (S.MAC) 
KI10 user page map page symbols (S.MAC) 
(NOTE: These do not follow the convention in the monitor)

.S???? Spool bits (S.MAC)

.TY??? DEVTYP UUO bits (S.MAC)

*^fSl D ° ntt ty?e Grror ^ssages on error interceot 

.Uivf?? Intercept device ok errors (S.MAC) 

JB.L?? Job limit bits (S.MAC)

SP.??? Second processor status bits (S.MAC)

AP.??? APR CONI/CONO bits (S.MAC)

PC.??? PC word flags (S.MAC)

PI.??? PI CONI/CONO bits (S.MAC)

IP.??? KI10 APR CONI/CONO bits (S.MAC)

IC.??? KI10 PC word flags (S.MAC)

II.??? KI10 PI CONI/CONO bits (S.MAC)

"Z'Ul fJ 1 C0 *°/?^I "t. for both KA10 and KI10 (s.MAC) 

5.*?- r o^O/CON? S f" 5 0th . *** "2 KII ° t S '««) 

-^.... u CCNO/CONI oits for both KA10 and KI10 (S.MAC) 

STRUCTURE UUO CODES: (NOTE: These do not follow monitor convention)

.FS??? STRUUO function code

.ER??? STRUUO error code
SAVE LAST
JOB'S ACS IN 
SHADOW AREA 



SAVUAC 




RESTORE 

SOFTWARE 

STATE FOR 

NEXT JOB 



d?6Alk 



SWITCH TO j 
AC BLOCK i 
& RESTORE PC 



SET UP 
NULL JOB 
IN ACS 



SAVE SOFTWARE 

STATE 

OF 

LAST JOB 



(1) 



CI?8B 



CIP7: 



SET UP 

UBR FOR 

NEXT JOB 



SETRLl 



(2) 




| SAVE PC 
I AND SWITCH 
TO EXEC. ACS 



CK0INT 



CLOCK TICK 
OR 
PROGRAM 
BLOCKS 



(4) 



RESTORE 
NEXT JOB'S 



RESUAC 








■*-<*. 
NOTES FOR THE CONTROL ROUTINE



1. For VM Systems, this means saving the job's PC into
JOBPC and the address of DDT into JOBDDT.

For Non-VM Systems, this means saving the above
items and copying the last user's Job Device
Assignment table from the CPU Data Block into the
JOB DATA AREA.

(Also See Note 3 Below)

2. For KA Systems, this routine sets up the hardware 
relocation and protection registers. 

3. For VM Systems, this means restoring the job's PC 
into .CPPC and the address of DDT into .CPDDT 
(USRDDT) both in the CPU Data Block. 

For Non-VM Systems, this means restoring the above 
items and copying the next user's Job Device 
Assignment table from the JOB DATA AREA into the 
CPU Data Block. 

(The Reverse of Note #1) 

4. For KA Systems, this routine would save the User's 
PC and the User's AC. 



2.-J3 



CLOCK TICK 



\ 



n at "" 

CH3 








V 


CH 7 




MONITOR 
CYCLE 






*' 


USER JOB A 




JOB A 


/ 






• 



/I PR INT(KA/KI] 



INTERVAL TIMER (KL) 



SCHEDULING 



tt -ji 



CH3 



-J 



en 




iBi.aaaibw^iUi^iij 



USER J#8 AI 



SjiliU'.JKJ^-'^S.'iJ'iM.i 



CLOCK TICK DURING INTERRUPT 






CHd HR 


,r~— 






chs IHM 






" 




\ ^ * 




CH7 






IT^MDNITOR T 










>' 


USER JOBA 








JOB B 


I/O 


[NT 


a 


7 REQUEST 





APR1NT(ka/ki) 



INTERVAL TIMER (KL) 



1/0 WAIT 



CHS 



(0 

I 

■J2 



CH7 



UUO 



(partial cycle) 



USER JOB A 



HULL JOB 



(partial cycle) 



flfQUFUE 



SCH 




JOB A 



device 

INT 



UUO INTERRUPTED 



» 

o 



» ii 



CH3 



CH7 



UUO 



U51R JOBA 







trttHWft 

CYCLE 



JOB B 



APR INT (KA/KI) 
INTERVAL TIMER (KL) 



KL EBOX / MBOX TIME ACCOUNTING 





[ 



10 



MACHINE CYCLE TIME 



15 



20 



25 



30 



CASE # J , 
(light i/o) 
! 



MBOX REFERENCE COUNTS 



MBOX 
TOTAL = 2 



INSTRUCTION 
FETCH 



OPERAND 
FETCH 



I NS TRUCTION DECOD I NG 



EBOX BUSY TIME 
NSTRUCTION 



INSTRUCTION EXECUTION 



EBOX 
TOTAL = 10 



CASE # 2, 

(heavy i/o) 



CASE 1 TOTAL = 12 
TOTAL TIME = 18 



MBOX REFERENCE COUNTS 

I 



MBOX 
TOTAL = 2 



f 



INSTRUCTION 
FETCH 



OPERAND 
FETCH 



EBOX BUSY TIME 



INST RUCTION DECOD ING 

V///////A 



INSTRUCTI ON EXECUTION EB0X 
Y//^/Zyy/A TOTAL 



= 10 



CASE 2 TOTAL 



TOTAL TIME = 



= 12 
2£_ 



2-V 



TINE ACCOUNTING 



PI LEVEL 



pi im 

t 



1 












- — 




DEVICE 


2 




„.-- 


3 




... " 









f 


-*"'" 




*» 










5 










6 
CONTROL ROUTINE






UUO 








USER 








JOB A 





CLOCK 
TICK 



UUO PROCESSOR ! SCHED, 



JOB B 



■+— + 1 



TIME 



CLOCK 
TICK 



OPERATION 


C 


ORE COMMAND 




CORE UUO 




SWAPPER 










DISPATCH 




(COMCON) 




(UUOCON) 














1 






EPROCESSING - 




COREO 
(CORED 




CORUUO 
(CORED 




SWAP 
(SCHEDD ■ 
























ALLOCATION 


VIRCHK 
(VMSER) 




CORE1 
(CORED 




























ASSIGNMENT 




CORE1A 
(KxSER) 

— 




CORGET 
(KxSER) 



z-\ 



PAGTAB 





1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

13 




UPMP 



JOB 4 



UPMP 




JOB 7 




PAGPTR 



POINTS TO FIRST 
FREE LINK 



NOTE: 

1. A ZERO ENTRY INDICATES THE END 
OF THE CHAIN. 

2. ALL FREE PAGES ARE LINKED 
TOGETHER ALSO. 
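
The linking scheme in the PAGTAB figure can be modeled with an ordinary list. The Python sketch below is illustrative only: the function name and the sample page numbers are hypothetical, and physical page 0 is assumed never to appear on a chain, so that a zero entry can mark the end.

```python
def chain_pages(pagtab, first):
    """Return the physical pages on a PAGTAB chain, in order.

    pagtab[p] holds the number of the page that follows page p,
    or 0 if page p is the last one on its chain.
    """
    pages = []
    seen = set()
    page = first
    while page != 0:
        if page in seen:
            raise ValueError("PAGTAB chain loops back on itself")
        seen.add(page)
        pages.append(page)
        page = pagtab[page]
    return pages


# Hypothetical 14-page memory: one job owns 3 -> 7 -> 9,
# and the free list starts at PAGPTR and runs 5 -> 12 -> 13.
pagtab = [0] * 14
pagtab[3], pagtab[7] = 7, 9
pagtab[5], pagtab[12] = 12, 13
pagptr = 5
```

With this data, `chain_pages(pagtab, 3)` lists the job's pages and `chain_pages(pagtab, pagptr)` lists the free pages, using the same walk in both cases.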



3 -"2- 



JOB 

HAS CORE 



JOB 

DOESN'T
HAVE CORE 



JOB 
DE- 
j CREASING 



JOB 
INCREASING 



1 



{AMOUNT 
NOT 
AVAILABLE 



AMOUNT 
AVAILABLE 



NOT 
VIRTUAL 



VIRTUAL 
CORE UUO




f CHGCOR j 



CHGOR 



CHANGE CORE 
ASSIGNMENT 




SETUP 

SKIP RETURN 



IFJACCT 

ON, SET PHYSICAL 

ONLY BIT 



IOWAIT 



WAIT FOR ALL 
DEVICES TO BE 
INACTIVE 



YES 



CORBND 



DETERMINE HIGHEST 
JOB ADDRESSES 



STORE 
RESULT IN 
USER AC



•"( STOTAC J ' 




CORE! 



TRY TO 
ASSIGN CORE 




CORU1:: 



YES 



UCORHI 



TRY TO ASSIGN 
HIGH SEG CORE 




SETUP 

SKIP RETURN 



1. THE CORE UUO WITH ZERO ARGUMENT 
DOES NOT AFFECT THE SIZE OF USER 
CORE. RATHER IT RETURNS THE VALUES 
OF THE JOB'S HIGHEST ADDRESSES. 

2. ERRORS ARE: ACTIVE I/O OR SAVE IN 
PROGRESS; SUM OF SEGS TOO LARGE; 
PROTECTION FAILURE. 



WSCHED 



BLOCK JOB 




( RETURN J 
CORE COMMAND



C COREO J 




C ERROR A 
EXIT J 



CHGSWP 



CHANGE 
INCORES12E 
TO NEW SIZE 



/successful^ 




3-> 



CORE ALLOCATION 



CORE! 




YES 



YES 



VIRCHK 



PERFORM VIRTUAL 
ALLOCATION IF 
APPROPRIATE 



FROM PAGE CM-6 




DIRECT RETURN 
FROM VIRCHK 

SKIP RETURN 
FROM VIRCHK 



UUOSKIP 
RETURN 



) 



DOUBLE SKIP RETURN 
FROM VIRCHK 



■c 




RETURN 



3 



ALLOW ONE PAGE 
FOR UPMP 



UPDATE 
VIRTAL 




PERFORM 
ASSIGNMENT 



ERROR 
RETURN 



J 



SUMSEG 



COMPUTE SIZE 
OF JOB 



1. VIRCHK SATISFIES ALL REQUESTS EXCEPT: 

a. CHANGES TO SHARABLE HIGH SEGS, 
AND 

b. LOW SEGMENTS EXPANDING FROM
ZERO SIZE. 



3-fe 



ALLOCATION AND ASSIGNMENT 




HERE IF JOB VIRTUAL 

OR HAS TO GO VIRTUAL I B 



c 



RETURN 
CM-5 (2) 



DEALLOCATE 
STORAGE 



c 



RETURN 
(UUO SKIP) 



YES 



YES 




RETURN 
UUO SKIP) 



GS1ZT 



TEST VARIOUS CORE 
LIMITS 




PHYCR2 



ALLOCATE PHYSICAL 
CORE 



1. IN THESE TWO CASES, VIRCHK LETS CORE1
DO THE ALLOCATION. 

2. WE TAKE THIS PATH IF: 

a. NEW SIZE < CPPL, OR 

b. CPPL < NEW SIZE < MPPL AND CPPL IS 
A GUIDELINE RATHER THAN A 
LIMIT. 

3. VMCMAX CHECKED. 
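
The condition in note 2 can be written as a single predicate. This Python sketch only restates the note. The function and parameter names are hypothetical, and reading CPPL and MPPL as the job's current and maximum physical page limits is an assumption, since the note does not expand them.

```python
def takes_physical_path(new_size, cppl, mppl, cppl_is_guideline):
    """True if the request follows the path described in note 2.

    new_size, cppl and mppl are page counts. cppl_is_guideline says
    whether CPPL is a guideline rather than a hard limit.
    """
    if new_size < cppl:
        return True                                # case a
    return cppl < new_size < mppl and cppl_is_guideline   # case b
```

The comparisons are strict because the note is written with strict inequalities; how the monitor treats a request exactly equal to CPPL or MPPL is not stated in the note and is not modeled here.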



f RETURN "N 

y (uuosKiPiy 



3-7 



HERE WHEN ALLOCATION HAS 
BEEN DONE. 



f CORE1A J 



KLSER 



COMPUTE 
CHANGE 



HERE WHEN ASKING 
FOR CORE AND IT 
IS AVAILABLE 



SCPAGS 



GET PHYSICAL PAGE 
NO. OF LAST PAGE 
SEGMENT 



UPDATE JBTSWP 



ADPAGS 



ADD REQUESTED 
NUMBER OF PAGES 



f CORGT2 ) 



U) CALLS FRDCR IN SEGCON TO FREE 
DORMANT AND IDLE HIGH 



CORE ASSIGNMENT 





CORGT7 



J 



C0RE1B 




HERE WHEN 
GIVING UP CORE 



YES 



SNPAGS 



FIND PHYSICAL PAGE 
NUMBER OF FIRST 
PAGE TO RETURN 



SNPAGS 



FIND PHYSICAL PAGE 
NUMBER OF FIRST 
LOGICAL PAGE 



GVPAGS 



GIVE BACK PAGES 




AND IT IS A 
LOW SEGMENT 



CORGT2 



) 



CORGT1 



J 



IT IS A LOW SEGMENT T N 



CLEAR 

JBTUPM 

ENTRY 



GVPAGS 



GIVE 
BACK UPMP 



( CORGT6 J 
IN KLSER



HERE WHEN DOiNG 
ASSIGNMENT 
STARTING WITH 



c 



GORGET 



J 



HERE FROM 
SWAPPER 



SVEUB 



SAVE CURRENT 



JUST BUILDS LIST IN 
PAGTAB AND RETURNS 
PHYSICAL PAGE 
NUMBER OF START 
OF CHAIN 




EXPANDING SEGMENT 



CORGT7 



3 



CORGT1 



UPMP 

ALREADY 

EXISTS 



3 



STORE STARTING 
PAGE NUMBER IN 
JBTHSA 



UPMP GOES ON 
CHAIN OF ITS OWN 



GTPAGS 



GET ONE MORE 
PAGE FOR UPMP 



THIS AND PAGTA8 
MAKES MAPPING IN 
OTHER JOB'S UPMP EASY 



UPDATE JBTUPM 
WITH ADDRESS 
OF UPMP 




HERE WHEN JOB
HAS NO CORE AND, 
THE UPMP MUST 
BE BUILT 



INITIALIZE UPMP 
VALUES, JOBDAT 
VESTIGALJOBDAT 



MAP UPMP 
THROUGH EXEC 

340 PAGE 



STEUB 



TELL HARDWARE 
OF NEW UPMP 



STORE IN UPMP 
MAPPING FOR FIRST 
PAGE OF LOWSEG 



MAKE JOBDAT 
ADDRESSABLE 
THROUGH EXEC 
PAGE 341 



HERE WHEN 

1. JOB WITH CORE 
GET SOME MORE 

2. JOB GIVING UP 
CORE 



c 



CORGT7 



MAKE UPMP WRIT- 
ABLE AND ACCESS- 
IBLE AS EXEC 
PAGE 400 



D C 




CORGT6 



NO, JOB IS DECREASING TO 

3 



MARK JOB AS 
EXPANDING (JXPN) 
SKIP RETURN 



TELL HARDWARE OF 
CHANGE TO EPMP 



© 



c 



XPAND 



) IN 



SCHED1 



3-9 



APPLIES 
TO SWAPPER 
USE OF 
CORE 




c 



CORGT5 



USE IMGOUT 
AS SEGMENT 
SIZE 





UPDATE 
JBREL 



YES 



NOTHING TO CLEAR 



UPDATE 
JBTADR 
ENTRY 
(FROM R) 



!3> SNPAGS 



GET PHYSICAL PAGE 
NO. OF FIRST NEW 
PAGE 



f CORGT4 \ 



ZEROS 

NEW 

PAGES 



SWAPPER OR CORE 
COMMAND 



MAKE A 
PAGE 

ADDRESSABLE 
THRUP EVA 400 



CORE UUO 



SETREL 



SET UP PAGE 
MAP SLOTS (2) 



CLEAR IT 



CALLS MAPLOW TO 
SET UP UPMP 



STEP TO 
NEXT PAGE 
CHAINED THRU 
PAGTAB 



CURHGH 



(1) 



YES 




f RETURN J 



YES 




NEGATIVE SIZE IN U 



YES 



•{ RETURN J 



1. IF CHANGE IS TO CURRENT HIGH SEG, 
UPDATE CURRENT JOB'S PAGE MAP. 

2. SET UP R AND ADDRESS BREAK. 

3. JOBDAT IS ALWAYS THE FIRST PAGE IN 
THE CHAIN. PICK UP THE PHYSICAL PAGE 
NUMBER FROM RH UPMP LOCATION 400 
TO USE AS THE PAGTAB INDEX. CHAIN 
DOWN 0 IF HAD NO CORE, OR NO. OF PAGES
TO NEW ASSIGNMENT.



GIVEN STARTING PHYSICAL PAGE
NUMBER, FILL THE UPMP SLOTS BY
CHAINING DOWN PAGTAB AND EX- 
TRACTING THE PHYSICAL PAGE 
NUMBERS FOR PLACEMENT IN THE 
UPMP. 
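
Filling the UPMP slots from a PAGTAB chain can be sketched as follows. This is Python for illustration, not the monitor's routine. The function names are hypothetical, the default flag value 500000 octal assumes that the access and write bits are the two highest flag bits of an 18-bit entry, and packing the even page into the left half of each word is likewise an assumption.

```python
def fill_upmp(pagtab, first_page, flags=0o500000):
    """Build 18-bit mapping entries for consecutive virtual pages.

    Walks the PAGTAB chain from first_page (a zero link ends the chain)
    and combines each physical page number with the flag bits.
    """
    entries = []
    page = first_page
    while page != 0:
        if len(entries) >= 512:
            raise ValueError("chain is longer than a 512-entry page map")
        entries.append(flags | page)
        page = pagtab[page]
    return entries


def pack_halfwords(entries):
    """Pack 18-bit entries two per 36-bit word, first entry in the left half."""
    words = []
    for i in range(0, len(entries), 2):
        left = entries[i]
        right = entries[i + 1] if i + 1 < len(entries) else 0
        words.append((left << 18) | right)
    return words
```

Because the chain order is the virtual page order, no search is needed: the n-th page on the chain simply becomes the n-th slot of the map.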

3-10







USER PAGE MAP (UPMP) MAPPING ENTRY
ONE ENTRY PER PAGE (18 BITS)

18 BIT QUANTITY - 512 PER UPMP

A = 0 ACCESS DENIED, PAGE FAULT OCCURS
  = 1 ACCESS ALLOWED
P = 0 CONCEALED PAGE (EXECUTE ONLY)
  = 1 PUBLIC PAGE
W = 0 WRITE PROTECTED
  = 1 WRITABLE
S = 0 ALLOCATED
  = 1 ALLOCATED BUT ZERO
C = 0 CACHEABLE
  = 1 NOT CACHEABLE

PAGE ADDRESS - 13 BIT PHYSICAL MEMORY PAGE NUMBER OR
             - 17 BIT SWAPPING SPACE ADDRESS

TO FIND THE STATUS OF ANY PARTICULAR PAGE, USE THESE GUIDELINES:

1. IF A = 1 AND S = 0, THE PAGE IS IN CORE AT THE ADDRESS SPECIFIED BY PAGE-ADDRESS.

2. IF A = 0, S = 0, AND C = 0, THE PAGE DOES NOT EXIST.

3. IF A = 0 AND THE WSBTAB ENTRY FOR THIS PAGE = 1, THE PAGE IS IN CORE AT
   PAGE-ADDRESS.

4. IF A = 0, AABTAB = 1 AND WSBTAB = 0, THE ENTRY CONTAINS A DISK ADDRESS.

5. IF A = 0, S = 1, AND AABTAB = 0, THE PAGE IS ALLOCATED BUT ZERO.
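
The five guidelines can be combined into one decision function. The Python sketch below is an illustration under stated assumptions: the function name and result strings are hypothetical, and because the guidelines overlap, the order in which they are tested here (1, 3, 4, 5, then 2) is an assumption rather than something the text specifies.

```python
def page_status(a, s, c, wsbtab, aabtab):
    """Classify a page from its map bits and its WSBTAB/AABTAB entries.

    All arguments are 0 or 1 (or booleans).
    """
    if a and not s:
        return "in core"                       # guideline 1
    if not a and wsbtab:
        return "in core, access bit off"       # guideline 3
    if not a and aabtab and not wsbtab:
        return "on disk"                       # guideline 4
    if not a and s and not aabtab:
        return "allocated but zero"            # guideline 5
    if not a and not s and not c:
        return "does not exist"                # guideline 2
    return "undetermined"
```

Guideline 2 is tested last so that a page whose entry has A, S and C all clear is still reported as in core or on disk when WSBTAB or AABTAB says so.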



PAGE FAIL WORD 
KL10 
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35



-t — i — i — r 

FAIL TYPE 

_l ' i ' 



~1 — I — I — I — I — I — r 
VIRTUAL ADDRESS 
_J I I ,,, I I I i_ 



i — i — r 



L 



IP 



Z& 




-i — i — i — i — r 



_l I I I L. 



-I I L 



1 = USER VIRTUAL ADDRESS

0 = EXECUTIVE VIRTUAL ADDRESS



IF BIT 1 = 0, THEN BITS 1-7 ARE 
INTERPRETED AS FOLLOWS: 

01 02 03 04 05 06 07 






A 


W 


S 


T 


P 


C 



L 



= READONLY 

1 = WRITE 






KI10 
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35




~i — i — i — i — i — i — i — r 

VIRTUAL PAGE 
J I I. I ' i I i_ 




"i — i — i — r 

FAIL TYPE 

_i I I l_ 



IF BIT 31 = 0, BITS 31-35 ARE INTERPRETED 
AS FOLLOWS: 

31 32 33 34 35 






A 


W S T 



ARGUMENT BLOCK IN THE PFH

    OLD PC
    PAGE FAULT WORD
    VIRTUAL TIME
    PAGE RATE
    PSI VECTOR ADDRESS

THE PAGE FAULT WORD IS ORGANIZED AS FOLLOWS:

    BIT 0         SET IF WORKING SET CHANGED
    BITS 1-17     PAGE NUMBER
    BITS 18-35    FAULT TYPE

THE FAULT TYPES ARE:

    1  A OFF
    2  PAGE NOT IN WORKING SET
    3  PAGE CONTAINING UUO ARGUMENT NOT IN WORKING SET
    4  TIME TRAP
    5  ALLOCATED BUT ZERO (ABZ)
    6  ABZ AFTER UUO
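
A page fault handler has to pick these fields out of the page fault word. The Python sketch below shows one way to do so. It assumes the layout read from the figure (bit 0 is the working-set-changed flag, the rest of the left half is the page number, the right half is the fault type), and the names used are hypothetical.

```python
FAULT_TYPES = {
    1: "A off",
    2: "page not in working set",
    3: "page containing UUO argument not in working set",
    4: "time trap",
    5: "allocated but zero (ABZ)",
    6: "ABZ after UUO",
}


def decode_fault_word(word):
    """Split a 36-bit page fault word into (ws_changed, page, fault type)."""
    if not 0 <= word < 1 << 36:
        raise ValueError("expected a 36-bit word")
    ws_changed = bool(word >> 35)          # bit 0, the high-order bit
    page = (word >> 18) & 0o377777         # bits 1-17
    fault = word & 0o777777                # bits 18-35
    return ws_changed, page, FAULT_TYPES.get(fault, "unknown")
```

Bit numbering follows the usual 36-bit convention in which bit 0 is the most significant bit, so the flag is tested by shifting right 35 places.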



HERE ON PAGE FAULT 



( 


SEILM J 


" 




SET PAGE 
FAILCODE 




FAIL TYPE 25 



YES 



•T PTPPAR J 



FAIL TYPES 

36 & 37 ^ rtn \jr\ -»w yp^ 

"ARxr----^ Y ■ 

..ERROR,, 
SEL1MA:: JnO 



{ PRTRP ) 



MAKE Kl STYLE 
PAGE FAIL WORD 



TRIED TO CHANGE 
WRITE PROTECTED 
PAGE 




SEILM1 



) 



STOPCODE 
IME 



3-\\ 




CHECK PFH FOR VM USER 
DON'T RETURN IF FOUND



3-\S" 



UJ 



<r 



USRFL9:: 



ENTER HERE 

1 WHEN UPTMC COUNTED DOWN 
EOR CURRENT JOB TO CAUSE 
TIME INTERRUPT 
(COMES FROM CLOCK 1 @INCTM<H7) 

OR 

2 UUOCON FINDS ARGUMENT OUT 
OF CORE (UUOCHK) AND CALLS 
WOFLT WHICH ENTERS HERE TO 
GET PAGE IN CORE 




-«/elrrqr J 



D 



GET PAGE 
NUMBER AND 
MAP BITS 



USRFLX 




GO TO USER PROGRAM 



USRFLB 



3 



ESTIMATE ADDRESS TO 
STORE ARGUMENT BLOCK 



SET REASON 

= TIME INTERRUPT 




TIME 
USRFLG ) INTERRUPT 



D 



PAGE NOT IN CORE 



/'wad A 

V STOPCODE J 



WSBTAB + AABTAB 
DO NOT AGREE 



PAGE ACCESSIBLE 
BUT NOT 
IN CORE 




TURN ON 
ABIT 



CLEAR PAGE 
TABLE 



T GO TO USERJ 



■f 'USRFL6 J 



SHOULD WE\. NO 
SET A BIT ^> '( 1USRFL6 }LETPFHDOIT 



CONTINUE PROGRAM 



3-17 




PAGE NOT IN CORE 



STUFFINFO INTO PFH 
ARGUMENT BLOCK 




UFLTRT ) GO TO PFH 



C UFLTRT J GO TO PFH 



THE FOLLOWING INFORMATION IS RETURNED 
TO THE PFH: 

1. OLD PC 

2. PAGE FAULT WORD 

3. VIRTUALTIME 

4. PAGE RATE 



3-6 



VIRTUAL TIME TRAPPING 




INCL0CK1 RIGHT AFTER 
LIMIT CHECKING 



SET .UPTMC 
= 1 TICK 



SAVE OLD PC 
IN .UPTMC 



MAKE CONTEXT 
SWITCH START 
ATTIMFLT-.CPPC 



c 



CONTINUE 
MONITOR CYCLE 



) 



3-1? 



C TIMFLT J 



JOB RESTARTED IN VMSER (TIMFLT) 
TO HANDLE TIME INTERRUPT 



SAVE USER 
PC ON STACK 
(JDBPDL) 



RESET INTERNAL 
COUNTER 

.UPTMC 



-T3 

TO INDICATE 

TIME INTERRUPT 



USRFL1 
HANDLE IT 



( £XIT ) 

THROUGH PFH 

TO USER PROGRAM 



3 -ZO 



( FAULT J 



GET PAGE 
NUMBER 







PAGE NOT IN
MEMORY



ALLOCATED

BUT ZERO



\ 




1 

ACCESS 
ALLOWED 

t 



SETA 
BIT 



C RETURN J 




YES 



MUST BE USING
PHYSICAL GUIDE 
LINE OR AT HARD 
LIMIT 



SET UP NEXT 
TIME INT. 
FOR 1/2 SEC 



PAGE OUT PAGES 
WITH A OFF AND 
IN WORKINGS * 



' TO BRING W.S. BACK 
DOWN TO THE LIMIT 
(GUIDELINE) 



C RETURN J 



FIND CANDIDATE 
FOR PAGE OUT 



YES 





SET UP NEXT 
TIME INT FOR 
1-1/2 SEC 

I 



BUILD NEW 
FIFO LIST 



HERE ON FIRST PAGE FAULT 
( FIR5T ) 



RETURN 



£ 



SET UP TIME 
TRAP FOR 1/2 SEC 



PAGE 
OUT 



£ 



1 



NOW THAT ROOM
HAS BEEN MADE 
GO CREATE PAGE 



THE FIFO LIST IS A LIST OF VIRTUAL
PAGE NUMBERS FOR PAGES IN THE
WORKING SET. THE LIST IS COMPOSED
OF TWO SUBLISTS, EACH IN VIRTUAL PAGE
NUMBER ORDER. THE FIRST SUBLIST IS
FOR PAGES WITH THE "A" BIT OFF, THE
SECOND SUBLIST IS FOR PAGES WITH
THE "A" BIT ON. CANDIDATES FOR
PAGE OUT COME FROM THE FRONT OF THE
LIST. THE NUMBER OF A PAGE PAGED IN
IS PLACED AT THE END OF THE LIST.
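
The FIFO list described above can be modeled with a double-ended queue. The Python sketch below is only an illustration of the ordering rules. The function names are hypothetical, and the working set is represented as a dictionary from virtual page number to the state of its "A" bit.

```python
from collections import deque


def build_fifo(working_set):
    """Build the FIFO list from a working set.

    working_set maps virtual page number -> True if the A bit is on.
    Pages with A off come first, then pages with A on; each sublist
    is in virtual page number order.
    """
    a_off = sorted(page for page, a in working_set.items() if not a)
    a_on = sorted(page for page, a in working_set.items() if a)
    return deque(a_off + a_on)


def page_out_candidate(fifo):
    """Take the next page-out candidate from the front of the list."""
    return fifo.popleft()


def note_page_in(fifo, page):
    """Record a newly paged-in page at the end of the list."""
    fifo.append(page)
```

Placing the pages with the "A" bit off at the front means that pages not referenced since the bit was last cleared are the first to be paged out.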




COMMAND PROCESSOR FLOW 



COMCON 



COMMAND 



SCAN CMDMAP 

FOR COMMAND 

BIT 



COM! 



J. 



GET COMMAND 
FROM TTY 
BUFFER 
OR TTFCOM 



FIND 
COMMAND 

IN 
COMTAB 



COMDIS 



< EXECUTIVE 
COMMAND 
ROUTINE 



COMRET 



CLEAR 
COMMAND 
BIT, ETC. 




SET JNA, 
PRINT JOB#, 
ETC. 



PCRLFA 



PRINT MESG 
IF NEEDED 




DELAY 
NO COMMAND 

^ OR TYPE 

ERROR 
MESSAGE 




REQUEUE 



YES 



MARK JOB 

TO BE 
REQUEUED 



e 



RETURN TO 
CLOCK! 



4-1 



ENTER HERE FROM CLOCK! BECAUSE COMCMT^O 



C COMMAND A 



COMMAND 



FIND LINE WITH 
CMD READY 



TTYCOM 



PARTIAL CONTEXT 
SWITCH-UBR 



SVEUB 




COM1 



GET CMD FROM 
TTY BUFFER 



CTEXT 



S 



T 



F1ND CMD IN 
COMTAB & ITS 
DISPATCH 
TBL ENTRY 




SETUP AC F,U,J,W 



YES 



NO 



YES 



NO 



4-1 



GET CMD 
FROM TBL 
TTFCOM 



CAN COMMAND 
BE DONE 




; TYPES 

"LOGIN PLEASE" 



JOB # NOT NEEDED 
YES 





COMGO 



HERE TO FIND A FREE JOB NUMBER 

( NUMLOP )



NUMLOP 



SCAN FOR 
FREE JOB # 



JBTSTS JNA&CMWB&JRQ. 
ALL OFF 




NEWJOB 



YES 



TRY TO ATCH 
TTY TO JOB 



TTYAT1 




VF^ 



CREATE A 
PDB 



CREPDB 



CLEAR JOB 
TABLES & 
PDB 



CLRJBT 



SET UP 

WATCH TBL 

ENTRY 



SET TTY 
LOC IN 
JBTLOC 



4-3 



COMER ; PRINTS "JOB CAPACITY EXCEEDED" 




COMER }; PRINTS "JOB CAPACITY EXCEEDED" 



( CHKRUN )



HERE WITH A JOB NUMBER 



CHKBAT 



CHKXO 




JOB HAVE TO 
BE IN CORE



JOB 


EITHER 


SWAPPED OR 


IN 


TRANSIT


r 






SET DISP 

ADR FOR 
DLYCM 



; "PLEASE TYPE ^C
FIRST"



COMER ; "ILLEGAL IN BATCH
JOB" 



COMER ) ; "ILLEGAL WHEN 

EXECUTE ONLY JOB" 



f \ GO DELAY COMMAND 

-^(COMDIS } UNTIL JOB IS IN CORE 






4-4 



CHKC01 



SPECIAL FOR 
SAVE COMMANDS 



I 




CHKYPN 
( COMGO )



HERE WHEN 


READY 


TO 


GO TO 


CMD


ROUTINE 



j COMER ) 



HERE 


ON 


ANY 


ERR. 


MSG 


ADR 


SET 


UP 




NO 



CLEAR 

REQUEUE BIT, 
CMWRQ 



\/_ 



CLEAR CMD 
WAIT BIT, 
CMWB 



COMDIS - 



-ML 



SET 


NOINCK 


BIT 


SET DISP


ADR FOR 


MSG ROUT 



DO SELECTED 
ROUTINE 



| DLYCM1 \ 




SET CMWB 
& JRQ BIT
IF NOT SET 



DLYCOM 



JH. 



( COMRET )



INCREMENT 
SCANNER'S 

COMMAND PTR 
TTYCM 



( EXIT )



*-t> 



HERE AFTER THE COMMAND HAS BEEN "DONE" 




NOTE THAT 
CMD HAS 
BEEN DONE 



TTYCMR 



INCREMENT 

COMTOT 
DECREMENT 

COMCNT 



CLEAR BITS 

SET NOINCK 

CMWRQ 



DON'T KEEP JOB# 





KEEP JOB # 




SET JNA 

FOR THIS

JOB 



ATTACH 
ttv m tna 

111 i \s \* \f<* 



TTYATI 



( PCRLFA )



CLEAR OPR 
WAIT BIT, 
JDCON 




YES 



START SET 
TIME FOR 
WATCH 



WCHBEG 



YES 




NO 



4 



"^ 



PCRLF0 




ERROR OR 
NO JOB 



A-8> 




START JOB 
LINE AT USER 
LEVEL 



TTYUSR 



START JOB 
AFTER 10 
WAIT 

I ivusw — 



START JOB 
DON'T CHNG 
LINE LEVEL 



SETRUN 



START 
TTY 



TTYSTR 



NOJOBN 
SET? 



NO 



YES 



CLEAR TYPE 
AHEAD 



YES 





YES, 



-> 



KILL TTY 
ASSIGNMENT
IF NO JOB 



TTYKLd 



ANY JOB#?

NO

YES

YES

CURRENT JOB?

NO

CMWRQ SET?

NO

YES


MARK JOB 




TO BE 
REQUEUED 




\ 






J 




REQUE 





EXIT 



4-10



%l 



LEVEL 



1 
2 

3 
4
5 
6 
7 



UUO
USER 



JOB A 



MONITOR
CYCLE 

(2) 



CLOCK TICK 



EXAMPLE OF DELAYED COMMAND 



INITIAL STATE 



JOB C 



JOB A RUNNING 

JOB B STOPPED (COMMAND LEVEL) & SWAPPED-OUT




CLOCK TICK 



(1) 



(4) 



NOTES:

1. E COMMAND TYPED BY USER ASSOCIATED WITH JOB B.
   SCANNER SERVICE INTERRUPT CODE SETS CMDMAP BIT FOR THIS TTY LINE.

2. COMMAND PROCESSOR STARTS TO PROCESS THE E COMMAND. THIS COMMAND HAS THE "IN-CORE"
   BIT SET AND THE JOB IS SWAPPED-OUT, SO THE COMMAND MUST BE DELAYED UNTIL JOB B IS IN-CORE.
   COMMAND PROCESSOR SETS JOB'S JRQ BIT AND CMWB BIT (LEAVING THIS TTY LINE'S CMDMAP BIT SET).
   REQUEUING ROUTINE OF SCHEDULER PUTS JOB B INTO COMMAND WAIT QUEUE (HIGH PRIORITY FOR SWAP-IN).
   SWAPPER PICKS JOB B FOR SWAP-IN.

3. THE DELAYED COMMAND WILL BE PROCESSED AGAIN HERE IF NO OTHER COMMANDS ARE PENDING; IF SO,
   THE COMMAND WILL BE DELAYED AGAIN BECAUSE THE JOB IS STILL SWAPPED-OUT.

4. SWAPPER I/O COMPLETE INTERRUPT. SWAPPER CLEARS JOB B'S SWAP BIT, INDICATING JOB B IS NOW IN-CORE.

5. COMMAND PROCESSOR EVENTUALLY PICKS THE DELAYED COMMAND'S TTY LINE AGAIN AND SEES ORIGINAL
   COMMAND AGAIN (CMDMAP BIT STILL SET). THE COMMAND CAN NOW BE PROCESSED SINCE JOB IS NOW
   IN-CORE. AFTER PROCESSING THE COMMAND, JOB MUST BE REQUEUED BACK TO ORIGINAL QUEUE (STOP QUEUE).
   THIS IS INDICATED BY THE CMWRQ BIT FOR THIS COMMAND.
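
The sequence in the notes above can be walked through with a small Python model. It is a sketch under simplifying assumptions: the class and function names are invented for the example, one flag stands in for each monitor bit named in the notes, and the final requeue is modelled by simply setting JRQ again rather than by the real CMWRQ handling.

```python
from dataclasses import dataclass


@dataclass
class Job:
    swapped_out: bool
    jrq: bool = False   # job needs requeuing by the scheduler
    cmwb: bool = False  # command wait bit


@dataclass
class TtyLine:
    cmdmap_bit: bool = False  # set by scanner service when a command is typed


def command_pass(line: TtyLine, job: Job, needs_in_core: bool) -> str:
    """One pass of the command processor over a TTY line."""
    if not line.cmdmap_bit:
        return "idle"
    if needs_in_core and job.swapped_out:
        # Delay the command: ask for a requeue into command wait and
        # leave the line's CMDMAP bit set so the command is seen again.
        job.jrq = True
        job.cmwb = True
        return "delayed"
    # The job is in core, so the command can be processed now.
    line.cmdmap_bit = False
    if job.cmwb:
        # Simplification: after a delayed command, request a requeue
        # back to the job's original queue.
        job.cmwb = False
        job.jrq = True
    return "done"
```

Calling command_pass repeatedly while the job is still swapped out returns "delayed" each time, which corresponds to note 3.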



[JOB STATE TRANSITIONS DIAGRAM. LEGIBLE LABELS: SHORT TERM WAIT SATISFIED;
SHORT TERM EVENTS; CHOSEN TO RUN; ANY PREVIOUS STATE]



SCHEDULER QUEUES




HPQ1



{# defined in MONGEN) 





OUT 


IN 











PQ1 



Time 
Slice 

Expired 



NOKMAL 
EOTHJr 





OUT 


IN 












JBTOLS 
JBTJIL (Output List For 

(Just Swapped in List) Swapper) 

BE" 




SQ1S 


OUT 


IN 











Any 3Q can be designated as 3B) 



~0 



7.01 SCHEDULER FLOWCHART 



F-1 



F-2 



F-3 



F-4 



F-t 




CKJB5 



SELECTIVELY 
DEC ICPT & 
REQUE IF 
NECESSARY 



NXTJBX 



RESET CLASS 
QUOTAS IF END 
OF SCHEDULING 
INTERVAL 



NXTJB1



CHK CURRENT 
JOB FOR QUANTUM 
RUN/MCU 
EXPIRING 



CKJB1



REQUE CURRENT
JOB IF

NECESSARY 



REQUE OTHER 
JOBS THAT 
NEED IT 



"REQUE" JOBS 
FOR SHAR EVM 
RESOURCE 



F-8 



DO ANY 

NEEDED 

SWAPPING 

SWAP 



F-3 



SCHED 



CHOOSE JOB 
TO RUN 
NEXT • 



F-9, F-10 



ALLOCATE 
SHAR RES 
(IF NECESSARY) 
TO JOB 



F-11.F-12 



c 



POPJ 



3 



MR-5001 



5-3 



7.01 SCHEDULER 

THE SCHEDULER IS CALLED FROM CLOCK1 AT C1P6 + 1 



( ENTER FROM
CLOCK1 )




DECREMENT 
COR SCO 




SELECTIVELY DECREMENT 
ICPT -RETURN TO NXTJBX 

NXTJBX WHEN DONE



RESET CLASS 
QUOTAS AT END 
OF MICRO 
SCHEDULING 
INTERVAL 



SCDQTA 



CAN BE INCLUDED
BY PATCHING 
RQTPAT/JFCL 



RECORD WANT TO 
RUN TIMES 




S-4 



F-2 
ICPT MAINTENANCE 



HERE TO START 
SUMMING NEW 
QUEUE 



icpt HAS EXPIRED 



HERE TO
CONTINUE WITH
SAME Q BUT
NEXT JOB




SWITCH QUEUES 



ICPT 
HAS NOT 
EXPIRED 



STORE 
ICPT IN 
PDB



S"-s 



F-3 

THIS PAGE REQUEUES CURRENT JOB IF ICPT OR QRT
EXPIRED AND THEN GOES TO REQUEUE ALL JOBS 




— — REQUEUE ALL NEEDFUL JOBS 



— SPECIAL HANDLING



IF SO. JOB HAS BECOME 
SWAPPABLE AND IS TREATED 
AS IF QRT HAS EXPIRED 



YES 



QUANTUM RUN TIME HAS 

EXPIRED



RECORD 
OCCURRENCE 



RSPRC2 



REQUEUE JOB 
AND RESET 
QRT



QARNOT 



CKJB1

F-5 



F-4 

HERE TO REQUEUE CURRENT JOB IF NECESSARY 
AS WELL AS ALL JOBS IN JBTRQ 



HERE TO REQUEUE 
CURRENT JOB 



HERE WHEN CURRENT JOB 
DOESN'T NEED REQUEUING 



JBTRQ IS A LINKED
LIST OF JOBS WITH 
JRQ SET AND AWAITING 
REQUEUE 




-7 



F-5 

THIS ROUTINE REQUEUES A SINGLE JOB 




GET DISPATCH 
ADDRESS FROM 
QBITS + WSC 



( DISPATCH )

THESE DISPATCHES 
ARE SHOWN ON THIS 
AND THE FOLLOWING 
TWO PAGES 



HANDLES
DAEMON WAIT 
AND COMMAND 
WAIT 




PUT JOB IN 
COMMAND WAIT 
QUEUE 



QXFER 



C "* ) 



DAEMON
WAIT 




£"-& 



F-6 

QREQ DISPATCHES




— JOB STOPPING 



MARK JOB 
SWAPPABLE 



TAKE TRANSFER
TABLE ADDRESS
FROM QBITS



QXFER




©■ 



' > S S ^ QUEUE S' 


I 




YES 


SETUP
TRANSFER
TABLE ADDR
FOR STOP
Q










' ' 






ZERO ICPT 




" 




TRANSFER 
JOB 




QXFER 





_ TTY I/O WAIT 
' SATISFIED 



I QREQX \ 



CLEAR WAIT 
STATE CODE 



TAKE TRANSFER
TABLE ADDRESS
FROM QBITS




QIOWT .... I/O WAIT
QDIOWT ... DISK I/O WAIT
QPIOWT ... PAGING I/O WAIT
QNAPT .... NAP



EVENT WAIT 



I QEWT 1 
I QSLPT I 




CHECK MADE BY CALL
TO INRNQ ROUTINE.



TAKE TRANSFER
TABLE ADDRESS
FOR QBITS




S"-9 



F-7 



QPST 
QDST 



JOB IN 



I/O WAIT SATISFIED 
PAGING I/O SATISFIED 
DISK I/O SATISFIED 



YES 



^N^ A HUN 0. ^r 

j NO 




PUT JOB INTO
THE BACK OF
PQ1




QCHNG








» 




CLEAR WAIT 
STATE CODE 





I QREQX j 




YES 



JOB STARTING 
UP 







USE WSC 
INDEX QBITS



ALL REQUEUING EXITS THROUGH HERE 




s-«> 



F-8 



HERE AFTER JOB REQUEUING TO MANAGE EVM
RESOURCE AND TO CALL SWAPPER IF APPROPRIATE




TAKE ALL JOBS 
OUT OF EV.AU, 
DA WAIT 




IF SWAPPER IDLE AND
JOB WAITING TO BE
LOCKED, TRY TO LOCK
IT.



LOCKO 



THIS CODE DETERMINES 
WHETHER OR NOT 
SWAPPING WILL BE DONE. 




GIVE BACK 

MJVI. RES 



PARTIAL CYCLE 



YES 




/ SCHED \ 



s-n 



F-9 

HERE TO CHOOSE A SCAN TABLE 




CLEAR POTENTIAL 
LOST TIME FLAG 
(.CPPLT)



SO RUN NULL 
JOB 



CPU0 = SSCAN
CPU1 = SSCAN1




GO CHOOSE A JOB
FROM THE QUEUES 



GO AND TRY 
TO RUN FORCED 
JOB 



*REALLY A JRST TO MSCHED IN CPNSER.
IN GENERAL, MSCHED WILL DO A
PUSHJ TO SCHEDJ.



S-\Z 



F-10 

SCAN THE QUEUES CHOOSING A JOB TO RUN




FIND A
RUNNABLE
JOB IN

THE QUEUES 



SELECT A 
JOB FROM 
THE QUEUES 



OSCAN 



HERE WITH FORCED
JOB WHEN T/S
TURNED OFF AND
JOB STILL RUNNABLE

( SCHEDS )






CAUSING 
ICPT TO BE 
DECREMENTED ■ 
THE NEXT 
CYCLE 



I SCHEDC \ 




SET .CPPLT 




CJFWRX 
' ROUTINE 




THIS PATH 
IS TAKEN 
WHENEVER 
A JOB
IS 
REJECTED 



IF A FORCED 
JOB. INCREMENT 
FORCFC 



JOB
FROM
SUBQUEUES?

YES


NO 


SET .CPSQF 
(TELL CLOCK1
SO IT CAN 
ADJUST CLASS 
QUOTAS) 










' 






MAINTAIN 
RESPONSE 
DATA. CLEAR 
CPPLT AND 
CPCLF 





RESET 
FAIRNESS - 
COUNT 




INCREMENT 

FAIRNESS 

COUNT 



c e y — 



LEAVING 
JOB * 
IN J 



S-\2> 



F-11 




HERE WHEN WSC * FOR CANDIDATE JOB 
TO RUN - EITHER GIVE HIM RESOURCE 
OR GO UNWIND 



_ HERE WHEN 

* wsc y 



IF FORCE JOB 
WITH JXPN=*I 
LET IT RUN 



CJFRCX 



CLEAR 
•- — JXPN 

TEMPORARILY 




SET 

UNWIND FLAG 



NOW WE UNWIND 




UNWN01 
F12 



MAINTAIN 
SHARABLE 
RESOURCE 
DATA BASE



GIVE RESOURCE 

TO JOB 

AND UNWIND 

UPDATE 

■ AVTBMQ. 

USTBMQ 



CLEAR JOB'S 
WAIT STATE 
CODE




RUN THE 
"JOB 



£'\* 



F12 

HERE TO UNWIND RESOURCE. EITHER
UNWIND UP TO 10 LEVELS DEEP 
OR GIVE UP JOB 



RUN THE 
BEST JOB 

TO FREE 

THE DESIRED 
RESOURCE 



__ UNWINDING 
SUCCEEDED 



GIVE UP ON 
TRYING TO 
UNWIND FOR 
THIS JOB. 



JUST GO 
CHOOSE 
ANOTHER JOB. 




UNWNDF 



GO ONE LEVEL 
DEEPER IN 
UNWINDING 



S-\S 



F-13 



HERE WHEN NO JOB CAN BE FOUND TO RUN -
DETERMINE IF LOST TIME FLAG SHOULD BE 
SET THEN RETURN J08 



YES 




NO 



POINT TO OUT 
OF CORE SCAN 
TABLE - LSCAN 




( EXIT
WITH NULL JOB )



r-\t 



F-1 

DETERMINE WHERE WE LEFT OFF ON THE 
LAST PASS AND WHAT TO DO NOW



7.01 SWAPPER 




RETURN IPCF 
PAGES 



GVtPCP 




YES 



NO 



/ SWP2 \ 



CLEAR FIT, 
REMEMBER 
TIME IN PDB
OF VICTIM 



ZERFIT 



FIND COMPLETED 

SWPLST 

ENTRY 



FNOSLE 




YES 



/ FINOUT \ 



MR-5015 



fe-1 



F-2 

PICK A JOB TO SWAP IN 





PICK A JOB
USING

ISCAN



O.SCAN 




REMEMBER
JOB NUMBER



b-2. 



F-3 

SWAP IN JOB CHOSEN

NOW BRING IT IN 




FITSIZ 




fe-3 



F-4 



ESTABLISH THE PROPER SCAN TABLE FOR SWAP OUT JOB
SELECTION AND DECIDE HOW MUCH OF THE TABLE TO SCAN 




SET FLAG TO 
IGNORE ICPT 
DURING QUEUE 
SCAN 



CLEAR FLAG 
SO ICPT WILL 
BE CONSIDERED 




CAUSES QUEUE 
SCANNING TO GO UP 
TO AND INCLUDING 
JDC IQFOR1 




SET QUEUE SCAN 
TERMINATOR TO 
OSCANT 



CAUSES QUEUE SCANNING 
TO GO UP TO AND INCLUDING 
THE QUEUE THE FIT
JOB IS IN 



SET QUEUE SCAN 
TERMINATOR TO 
QSCANTQ 



POINT TO 
OSCAN 




OSCAN HAS THE FOLLOWING ENTRIES

SEARCH LABEL    QUEUE    SEARCH CRITERIA

OSCAN           STOP     QFOR
                SLP      QFOR
                EW       QFOR
                JDC      QBAK1
                TI       QFOR
OSCANT          JDC      QFOR1
QSCANTQ         PQ2      QFOR (INCLUDES QBAK1)
                PQ1      QBAK
                CMQ      QBAK
                HPQ?     QBAK



fc-4 




F-5 

SELECT A JOB FOR SWAP OUT 




SAVE AS 
JOB TO SWAP 



THIS WILL BE
THE SINGLE JOB 
ACTUALLY SWAPPED 
OUT 



GO LOOK FOR 
MORE JOBS 
UNTIL ENOUGH 
CORE COULD 
BE POTENTIALLY 
FREED



CORE FOUND 



t A 3 

Tyss 




SAVE 
STATISTICS 
















SET TIMER BIT IF
TIMER GONE OFF
AND JOB IN PQ AND
START ICPT COUNTING





/forcoo \ 



fe-S 



F-6 

DETERMINE IF JOB SELECTED FOR SWAP OUT
CAN BE SWAPPED OR MUST WAIT FOR
I/O TO STOP OR SHARABLE RESOURCES
TO BE GIVEN UP




THE EFFECT OF THIS ENTRY
CHANGES ALL DISPATCHES
TO EXIT THE SWAPPER 
THAT WOULD OTHERWISE 
SOTO A PASS P-S 





START 

TIMER 



ENTRY POINT TO FORCE
OUT IDLE HIGH SEGS



MARK JOB
AS HUNG WITH
ACTIVE I/O

GIVE UP ON
SWAP OUT FOR
THIS JOB

f FLGUNl \ 



£>-<& 




F7 

DO SOME PRE-SWAPOUT HOUSEKEEPING
AND THEN BUILD THE SWPLST ENTRY



DELETE JOB FROM 
OUTPUT SCAN LIST 
IOLSI 



DELETE JOB FROM 
JUST SWAPPED IN LIST 
JIL 




HOUSEKEEP 
JXPN 
PRETEND 
SWA PBWT 



FIXXPN 



y^ GET N 

^^ IT ^ 

■ T YES 


w NO 






I 

1 




STORE JOB #
IN FORCE
TRY AGAIN
NEXT TICK


BOSLST 
BUILD AN 
OUTPUT SWPLST 

ENTRY 










■ 


[ FLGNUL \ 


INCREMENT # 
OF SWAPS IN 
PROGRESS. 
CLEAR SWPPLT 


^TJ 


- ■ 




START REQUEST 
IF POSSIBLE 




SQOUT 




> 




HOUSEKEEP 
JXPN 




FIXXPN 




[ CHKXPN \ 





[ FINOUO ] 



NO NEED TO 
SWAP OUT IF 
USING NO CORE 



>-7 



F-8 

GET CORE FOR SWAP IN 
JOB AND MAKE SWPLST
ENTRY 



SWAP IN 

THE CHOSEN 

JOB 




SET UP TO 
SWAP IN THE 
UPMP 

BUSLST 



INCREMENT #
OF SWAPS IN 
PROGRESS. 

CLEAR SWPPLT



START I/O 
IF POSSIBLE 

SPIN 
! 



( "" ) 



fe-8 




fe-9 



F-10 

SWAP IN HOUSEKEEPING 




FININH 



NO 




GIVE BACK 
DISK SPACE 



GIVBAK 



FININ 
ROUTINE 




NO 



YES 




ADD AMOUNT
OF DECREASE 
TO AVAILABLE 
VIRTUAL CORE 



MARK JOB 
IN CORE 
AGAIN 



UNSWAP 





C 



fo-\0 



F-11 

CONTINUATION OF SWAP IN HOUSEKEEPING 



i 




ALL DONE SWAP IN 
HOUSEKEEPING 



ASSIGN OAT 



MAKE SURE JOB NOT
ON OLS



CLEAR BACKGROUND 
BATCH 




JOB IN 

BACKGROUND 

BATCH 



SET BACKGROUND 
BATCH 



S^ job in ^O; 
\^ POJ ^^~ 

JVES 


O 
ES 


J^ job IN ^VVES 

V^ JBTJIL J> " 

JNO 


y^ job in ^*SJ! 


1 PUTIN 
| JBTJIL 




N. JBTJIL ^f~ 









[no 




PUT JOB IN 
JBTJIL 





















IF SWAPOUT
ERROR GO 
TO OUTERR 



DELETE 

SWPLST 

ENTRY 



DLTSLE 



SET JS.MIG -

I.E. COMPLETELY 

MIGRATED 



F1NOU2 



RETURN CORE 
FOR SEGMENT 



KCORE1 



F-12 

SWAP OUT HOUSEKEEPING 



FINOT 
ROUTINE 





GO DO 
LOW SEG 



MR-5026 



fe-VL 



F-13 



HERE WHEN ENOUGH CORE CANNOT BE FREED BY DELETING
IDLE & DORMANT AND ENOUGH ELIGIBLE JOBS CANNOT BE
FOUND TO SWAP OUT




SAVE JOB 
NUMBER 



START 
TIMER 




COUNT NBR
TIMES 

FRUSTRATED 



FLAG AS 
FRUSTRATED 



USED BY SCNJOB TO 
IGNORE QUEUE POSITION 
AND ICPT EVEN IF
HIGHER PRIORITY 



( FLGUNL ) 



t>-\2> 



UUO FLOW



unjo THA? J 



Y ... 

SAVE ACS




SO 



XS3 



LASSS- 



PERFORM

REQUESTED
OPERATION




S3 



LA238 



RESTORE

AC'S AND
PC



HXTOTH ] 



MARK JOB

TO BE
REQUEUED



asscssD 



iscaso 




RE3CHED 



USCH2D 



aspcar 

IT 



I 



35SCHSS 



r 



X 



S2SB23T 
SC5TSASE 
CLOCS 137 



~7-\ 



PRELIMINARY MUUO TRAP CODE



COMMON 





C MUUO 


) 






-SWITCH AC BLKS. 
-SET UP P4 FOR

CORRECT CDB 
-COUNT #MUUO. 




1 






GET PC 

FROM 

UPMP 425 





HERE ON KERNEL, CONCEALED OR 
.PUBLIC NO TRAP TRAP 




YES 



SET UP 
PDL FOR 
JOB 



UUOSY1 
(UUOCON) 



UUOSY1 
( UUOCON) 



DO UUO 



EITHER DOORBELL 



OR ERROR IN 
NULL JOB 




UNJ 
STOPCODE 



) 



DO UUO 



7-2



Detecting the CPU Doorbell 



r "s 

i MUUOiA J 

1 




COUNT THE 
NUMBER OF 
DOORBELLS 




' 






SET PC 
FLAG; 
iC.UOU 












<r PROTOCOL ^> 


YES- 


DSKTIC 
TAPTIC 




V. DOORBELL^/ 




START I/O FOR 
DISK, TAPE 






NO 










1 


> 


NO 






/scheduler^ 

"v. DOORBELL .• 

JYES 


^RETURN TO A 
^NUUOB J 


CLEAR BITS 






■ 


STORE NUUOB 
PC AS THE PC 






' 


CHECK FOR A RUNNABLE JOB BY
DISPATCHING TO THE 


( CLKSPD J 


SCHEDULER; CACHE IS SWEPT 
IF NECESSARY 









-7-3 



UUO - Preprocessing, Dispatch and Exit



UUOSY1



J, 


GET UUO
INTO AC M 






MOVE JOB#
INTO J 




SAVE RETURN
ON PD LIST 



MPUUOt : 



.YP<? 


i 

PRINT ERROR








UUOERR## J 




| 


( EX1T ) 




YES 



STOP JOB 
PRINT ERROR) 



ILLINS** 



SETUP USER 
CHANNEL » 



( EXIT J 




NO 



.ONG 

i ISP. TABLE > YFS 
UUO 



NO 



\£ 





YES \, 



CHECK TH 



NQ 



UUQCHKrf* 



£" 




YFS DISPATCH 
TO 
FUNCTION 



NO 



V EXIT ) 




YES 



HANDLE 
TRAP ON 

ANY UUQ 



ANYUUO** 



DISPATCH 

TO 
FUNCTION I 




NO 



"* 



USRXT1 j 



YES 



ADJUST 
STACK FOR 
SKIP RETURN 



USRXI 



Ti j A. 




"7-^ 



Q 




YES |* 



DO MONITOR

OVERHEAD

CYCLE



USCHOl** 



7" 



USRRET t i 



NO 



f 






PROCESS 

EXEC 

ADDR. BREAK



EXCABK4* 



EXECUTE
PENDING
TRAP
INSTRUCTION



—9 



( EXIT ") 
V UMPRETfl«y 



"7-S 





SET FOR 
SKIP 

PFTHPf 




STOPCD MACRO

        STOPCD  CONT,TYPE,NAME,DISP

CONT    LOCATION TO JUMP TO AFTER
        PROCESSING ERROR

TYPE    TYPE OF FAILURE, USED TO
        DETERMINE NEXT COURSE OF
        ACTION:

            HALT
            STOP
            JOB
            DEBUG
            CPU

NAME    UNIQUE THREE LETTER NAME, WILL
        BE EXPANDED TO FORM GLOBAL LABEL
        S..NAME

DISP    ADDRESS OF ROUTINE TO TYPE
        ADDITIONAL INFORMATION (USUALLY
        NOT SPECIFIED)



7-7 



CODE GENERATED VIA
STOPCD MACRO

HALT TYPE

        STOPCD  CONT,TYPE,NAME

S..NAME::       HALT    CONT



"7-8 



CODE GENERATED VIA
STOPCD MACRO

DEBUG, JOB, STOP TYPES

- IF CONT IS A SYMBOLIC ADDRESS

        STOPCD  CONT,TYPE,NAME

S..name::       PUSHJ   P,DIE
                CAIA    type,(SIXBIT/name/)(17)
                JRST    CONT

- IF CONT IS . OR .+1 OR CPOPJ OR CPOPJ1

        STOPCD  .,type,name

S..name::       PUSHJ   P,DIE
                CAI     type,(SIXBIT/name/)(continue type)
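
The expansion rules on the last two pages can be summarized as a small Python function that returns the generated lines as text. This is only a restatement of the cases shown above, not the macro itself: the function name is invented, the CPU type is not covered because no expansion is shown for it here, and the "(continue type)" field is emitted literally because its value is not given.

```python
from typing import List

# CONT values for which no JRST is generated, per the second case above.
INLINE_CONTINUES = {".", ".+1", "CPOPJ", "CPOPJ1"}


def expand_stopcd(cont: str, kind: str, name: str) -> List[str]:
    """Return the instructions a STOPCD call expands to, as text lines."""
    label = f"S..{name}::"
    if kind == "HALT":
        return [f"{label} HALT {cont}"]
    if kind not in {"DEBUG", "JOB", "STOP"}:
        raise ValueError(f"no expansion shown for type {kind!r}")
    if cont in INLINE_CONTINUES:
        # CAI never skips, so execution continues in line after DIE returns.
        return [
            f"{label} PUSHJ P,DIE",
            f"CAI {kind},(SIXBIT/{name}/)(continue type)",
        ]
    # CAIA always skips, so the JRST is reached only on return from DIE.
    return [
        f"{label} PUSHJ P,DIE",
        f"CAIA {kind},(SIXBIT/{name}/)(17)",
        f"JRST {cont}",
    ]
```

For example, expand_stopcd("CONT", "DEBUG", "XYZ") yields the three-instruction form, where XYZ is a made-up stopcode name.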



7-9 



CODE GENERATED VIA
STOPCD MACRO

RECOVERABLE STOPCD

SOURCE CODE

ROUT:   CONDITIONAL TEST                ;everything OK?
        STOPCD  CONT,TYPE,NAME          ;NO,
        ...                             ;YES.
        POPJ    P,

CONT:   ...                             ;SYSTEM CONTINUE
                                        ;ROUTINE
        POPJ    P,

EXPANSION

ROUT:   CONDITIONAL TEST
S..NAME:: PUSHJ P,DIE
        CAIA    type,(SIXBIT/name/)(17) ;param for DIE
        JRST    CONT                    ;where to go
                                        ;IF WE COME BACK
        ...
        POPJ    P,

CONT:   ...
        POPJ    P,



-7-10 



CODE GENERATED VIA
STOPCD MACRO

NON RECOVERABLE STOPCD

SOURCE

ROUT:   CONDITIONAL TEST
        STOPCD  .,TYPE,NAME             ;DON'T COME BACK
        ...
        POPJ    P,

EXPANSION

ROUT:   CONDITIONAL TEST                ;everything OK?
S..name:: PUSHJ P,DIE                   ;no, DIE
        CAI     type,(SIXBIT/name/)(1)  ;yes, NO-OP
        ...                             ;CONTINUE
        POPJ    P,



7-11



EFFECT OF STOPCD TYPES



TYPE 



LEVEL 



PI 



NON-PI 



DEBUG 



JOB 



CONTINUE 

SYSTEM 

JOB 
ABORTED 



STOP 



HALT 




CPU 



SINGLE CPU 

OR 
LAST CPU OF 
MULTI CPU 



RELOAD 



MULTI-CPU 



JUMP INTO AC 



FINDING THE FAILING CYCLE 



PISTS: CONI PI,PISTS



BEFORE THE CRASH 



J 

i 



PISTS: 010000,,150377



AFTER THE CRASH 



t 
PI ACTIVE 

ALL CHANNELS 
ON 

INTERRUPTS IN PROGRESS
ON CH1 AND CH3
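
The annotations on the PISTS word can be checked with a short Python sketch. The bit assignments used here (bits 21-27 for interrupts in progress on channels 1-7, bit 28 for PI system on, bits 29-35 for channels 1-7 on) are an assumption based on the usual CONI PI layout and on the annotations in the figure; consult the processor reference manual before relying on them. The function name is invented for the example.

```python
from typing import Dict, List, Union


def decode_pi_status(right_half: int) -> Dict[str, Union[bool, List[int]]]:
    """Decode the right half (bits 18-35) of a saved CONI PI word."""

    def bit(n: int) -> bool:
        # PDP-10 bits are numbered from the left; bit 35 is least significant.
        return bool(right_half & (1 << (35 - n)))

    return {
        "in_progress": [ch for ch in range(1, 8) if bit(20 + ch)],
        "pi_active": bit(28),
        "channels_on": [ch for ch in range(1, 8) if bit(28 + ch)],
    }


status = decode_pi_status(0o150377)
```

For the right half 150377 shown above, this decoding reports the PI system active, all seven channels on, and interrupts in progress on channels 1 and 3, which agrees with the annotations.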



-J 



PUSHDOWN LISTS 

UUO CYCLE => 3/40510

MONITOR CYCLE => NULPDL

DEVICE INTERRUPT CYCLE => C1PD1

C2PD1
C3PD1
C4PD1
C5PD1
C6PD1

INTERRUPT LEVEL — 



* Typed by FILDDT as ONCPDL + n 



HOW TOPS-10 DIES



DEPOSIT NON-ZERO
IN LOCATION 30 



START AT 
407 



CLOCK 
INTERRUPT 



J 

I 

(A 



SYSTOP 



IN-LINE 
STOPCODE 


APR 
ERROR 














APR 
INTERRUPT 




^_^— -- — - 




(1) 
DIE 





PAGE FAULT 
UUO ERROR 




REBOOT 



I 



(2) 
ENDSTS 



(1) Depending on the contents of the STATES 
and DEBUGP the DIE routine may kill the 
job (ZAPJOB) and continue the system. 

(2) If BOOTS cannot be found the system will 
NOTES ON INIT

1. DVASRC — Generic Device Search 

On the first generic device search, we are only trying 
to verify the existence of a device of the specified type at 
an appropriate station. If the user is spooling the device, 
this is all we need in order to let the INIT succeed. If he 
is not spooling, we must find a device which is available to 
him. This is the purpose of the second call to DVASRC. 

On the generic search we look first at the user's own 
station. If no such device exists at his station, we look 
at the central station. We try to find a device ASSIGNed to 
this user, but not INITed. If that fails we attempt to find 
any free device of the correct generic type at the correct
station.

There are four possible outcomes: 

1. Find a device ASSIGNed to this user but not INITed 

2. Find a free device 

3. Device exists, but not available 

4. Device does not exist 



Note that if the device exists at the user's station 

but is unavailable we get result 3. However, if the user is 

at a remote station and the device does not exist at his 
station, we look at the central station. 

2. If the device should be unavailable at the time we try 
to assign it, this flag says we should come back and 
look for another. (Normally will not happen.) 

3. This is relevant only to nondisk systems. 

4. This routine is used by both the INIT UUO and the ASSIGN
command.

5. This flag is used when we must distinguish the real 
system device from a device assigned logical name SYS. 
6. The device name TTY always means the job's controlling
TTY. 

7. e.g., LPTS1 

8. Unless we found a DDB that was ASSIGNed but not INITed 
by the user, we will set up a new DDB by copying the 
prototype disk DDB. We copy DEVNAM from the DDB which 
we found. If we found this DDB on a logical device 
search, DEVNAM will match the physical device name 
specified on the ASSIGN command which set up the logical 
name. Otherwise, DEVNAM will match the argument of the 
INIT. 
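
The generic search described in note 1 can be sketched in Python as follows. This is an illustration of the search order and the four outcomes only, not DVASRC itself. The Device record, the station number used for the central station, and the test for a "free" device (neither ASSIGNed nor INITed) are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple

CENTRAL_STATION = 0  # hypothetical station number for the central site


class Outcome(Enum):
    ASSIGNED_NOT_INITED = 1
    FREE_DEVICE = 2
    EXISTS_NOT_AVAILABLE = 3
    DOES_NOT_EXIST = 4


@dataclass
class Device:
    dev_type: str
    station: int
    assigned_to: Optional[int] = None  # job number, if ASSIGNed
    inited: bool = False


def search_station(devices: List[Device], dev_type: str, station: int,
                   job: int) -> Tuple[Outcome, Optional[Device]]:
    """Search one station for a device of the given generic type."""
    here = [d for d in devices
            if d.dev_type == dev_type and d.station == station]
    if not here:
        return Outcome.DOES_NOT_EXIST, None
    for d in here:  # first choice: ASSIGNed to this user but not INITed
        if d.assigned_to == job and not d.inited:
            return Outcome.ASSIGNED_NOT_INITED, d
    for d in here:  # second choice: any free device
        if d.assigned_to is None and not d.inited:
            return Outcome.FREE_DEVICE, d
    return Outcome.EXISTS_NOT_AVAILABLE, None


def generic_search(devices: List[Device], dev_type: str, user_station: int,
                   job: int) -> Tuple[Outcome, Optional[Device]]:
    """Look at the user's own station, then at the central station."""
    outcome, dev = search_station(devices, dev_type, user_station, job)
    if outcome is Outcome.DOES_NOT_EXIST and user_station != CENTRAL_STATION:
        outcome, dev = search_station(devices, dev_type, CENTRAL_STATION, job)
    return outcome, dev
```

Note that the central station is consulted only when no device of the type exists at the user's station; a device that exists there but is busy gives result 3 without looking further.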
NOTES ON INPUT

1. This will always be true unless the user is 
changing the structure of the buffer ring. 

2. Mark the user's current buffer as now available to 
the device interrupt routine. 

3. Check IOACT in DEVIOS. 

4. Except for TTY, this checks whether the buffer ahead
of the buffer we are about to give to the user is
empty. Hence, for an N-buffer ring, we start the
device when N-1 buffers are empty.

For TTY we check the same buffer which we are 
giving to the user. The TTY device dependent 
routine does not actually "start the device," but 
copies characters from the monitor TTY buffer to 
the user's buffer. See SCNSER flows for details. 

5. WSYNC sets the job's wait state code to IO Wait and
calls WSCHED. The job is stopped at this point and
its stored PC will say to restart it after the
PUSHJ to WSYNC. The interrupt routine must get the
job out of IO Wait when the next buffer is full.
WSYNC will give an immediate return if IOACT is not 
set. This allows us to give the job an "error" 
return on end of file. 



NOTES ON OUTPUT UUO 

1. Normally the user's first OUTPUT UUO will take the 
NO branch. Its only function then is to set up the 
buffer ring and initialize the buffer control
block.

2. Unless the user set the IOWC bit, we compute the 
buffer word count by looking at the byte pointer in 
the ring header. 

3. Mark the current buffer as available to be written
out.

4. Check IOACT.

5. See WSYNC note for INPUT.

Notes on CLOSE 

1. WAIT1 will repeatedly call WSYNC until the device
is no longer active. Hence, it holds the job in IO
Wait until all buffers have been released by the
interrupt routine.

2. Hence, after CLOSE it will appear that the ring has 
been set up but not used. 

3. Device dependent routine for dump mode input close. 

4. Device dependent routine for buffered mode input
close.

5. Device dependent routine for dump mode output 
close. 

6. Ensures that all buffers are written. 

7. Device dependent routine for buffered mode output
close.



Notes on Release 

1. Hence, RELEASE implies a CLOSE for the same
channel.

2. This will normally give an immediate return, since
CLOSE1 also called WAIT1.

3. Device dependent routine for RELEASE. 

4. This applies only to non-disk systems. 
INTERRUPT ROUTINE CHAIN



40 + 2N:        JSR     CH'N

CH'N:

DEV1'INT:       CONSO   DEV1, Conditions
                JRST    DEV2'INT

                Process DEV1 Interrupt



DEV2'INT:       CONSO   DEV2, Conditions
                JRST    DEV3'INT

                Process DEV2 Interrupt



DEV3'INT:       CONSO   DEV3, Conditions
                JEN     @CH'N

                Process DEV3 Interrupt
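
The chain above can be read as a linear poll of the devices sharing a channel. The following Python sketch models that control flow only; the function and type names are invented for the example. CONSO skips the next instruction when any of the tested condition bits is one, so the JRST (or the final JEN) is taken only when the device is not requesting service.

```python
from typing import Callable, List, Optional, Tuple

# (device name, "are its interrupt conditions set?", service routine)
ChainEntry = Tuple[str, Callable[[], bool], Callable[[], None]]


def run_interrupt_chain(chain: List[ChainEntry]) -> Optional[str]:
    """Poll each device on the channel in chain order.

    Returns the name of the device that was serviced, or None if no
    device claimed the interrupt (the JEN @CH'N dismiss at the end).
    """
    for name, conditions_set, service in chain:
        if conditions_set():   # CONSO skips the jump to the next device
            service()          # "Process DEVn Interrupt"
            return name
    return None                # fell off the end of the chain: dismiss
```

Devices earlier in the chain are tested first, so their position determines how quickly they are recognized on a shared channel.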
8.17 NON-STANDARD DEVICE PI ASSIGNMENT

Under ordinary circumstances when COMMON is assembled, devices are
assigned to PI channels according to their group priority. (Refer to
Table 8-1.) If you have at your installation a device not listed as a
standard device in Table 8-1 and you have written your own Monitor
Device Service Routine, you must specify the device mnemonic (in 3
characters or less) and designate an appropriate priority interrupt
channel. You must answer all three questions as they apply to your
configuration. The first question

TYPE "DEVICE-MNEMONIC,PI-CHANNEL" FOR SPECIAL DEVICES

requests special device service routines that do not need either a
Channel Save Routine or a Device Data Block. The second question

TYPE "DEVICE-MNEMONIC,PI-CHANNEL,NO.-OF-DEVICES"

requests devices with special service routines that have a Device Data
Block but no Channel Save Routine. The third question

TYPE "DEVICE-MNEMONIC,PI-CHANNEL,HIGHEST-AC-TO-SAVE"

requests devices with special service routines that have a Channel
Save Routine, but no Device Data Block.

Special devices that you added during the HDWGEN dialogue are chained
to the requested channel. To give a device the exclusive use of a
channel, you respond to the "symbol,value" question with

UNIQn,1

where n is the priority interrupt channel to be reserved. (Refer to
the UNIQn,1 entry in Section 8.14.1.)



One or more priority interrupt channels may be reserved for real-time
devices with the RTTRP monitor call. These devices are completely
controlled by user programs and have no specific code loaded with the
monitor. To reserve a priority interrupt channel for use with RTTRP,
you should respond to the "symbol,value" question with

RTCHn,1

where n is the priority interrupt channel to be reserved.

(Refer to the RTCHn,1 entry in Section 8.14.1 and to the DECsystem-10
Monitor Calls manual.)

I/O devices are grouped by their relative interrupt speeds. If any
device of a particular group is present, a PI channel is assigned to
that device according to its group priority. Group priorities for
standard devices may be revised by rearranging the devices in INTTAB,
which is in the COMMON source file.
Introduction and KL Orientation

The KL10 processor is the basis of the high-end DECsystem-10 line
(1080, 1090) and the -20 series systems (2040, 2050). Each of these
systems contains five subsystems:

• Ebox

• Mbox

• Memory

• Front End

• I/O devices

The diagram on the following page illustrates the basic
configuration of the KL's subsystems.
[Figure 1: KL Configuration. Legible labels: Data channels, I/O subsystem.]
The Ebox (execution box) is primarily concerned with the
processing of machine instructions. It fetches instructions from
memory, calculates effective addresses, and performs instruction
[illegible]. Additionally, the Ebox controls all devices by
transmitting control information through the Ebus, and in turn
receiving interrupts and device status along the same route.
[Illegible] the Ebox controls data transfers between devices and memory
for those devices not using a data channel.

The Mbox is responsible for coordinating physical memory
requests. For instance, the Mbox must service all Ebox memory
requests. Moreover, on DECSYSTEM-20 and 1090 systems the data
channels connect to the Mbox rather than being interfaced to
physical memory. Aside from its function in handling
memory requests, the Mbox has two related and significant
jobs. First, the Mbox is the only system component that translates
virtual addresses into physical addresses. Second, the Mbox
contains and controls the cache memory.

The front-end subsystem comprises the PDP-11, associated -11
devices, and the DTE20 (which interfaces the -11 to the -10 Ebus).
At the very least the -11 is responsible for overseeing operation of
the KL processor. This responsibility extends to requiring the -11
to [illegible] the KL's memory and micromemories during bootstrap.
Support of the operator's terminal is associated with these tasks.
Additionally, DECSYSTEM-20s place all unit record and communications
[illegible] under control of the front end -11, or of other -11s
attached to a DTE.

The I/O subsystem includes all I/O devices that are controlled
[illegible] by the KL-10. Such devices invariably include disk
and tape controllers. Additionally, DECsystem-10s place
unit record equipment and DECtapes in the I/O subsystem. (In other
systems, such devices belong to the front end -11.) To provide full
support, -10s need an additional device called the DIA.

Finally, the memory subsystem contains physical core memory.
(It does not include the cache; cache is located in the Mbox.)

All KL-based systems contain these five subsystems. However,
the internals of a particular subsystem might vary with the type of
system. For instance, a 1080 Mbox will have cache while 2040 Mboxes
do not. In the interest of clarifying the distinctions between the
different systems, each section of this document will describe the
appearance of the subsystems for each type of CPU.

Here is a summary, by subsystem, of optional KL-based system
components.
1080 1090 2040 2050
Paging on a KL Processor

This section describes the different types of paging available
on KL processors. Section 2.1 concerns so-called KI-style paging,
which is the scheme implemented on KL-10 processors (1080, 1088,
1090, 1099). Section 2.2 explains KL-style paging as implemented on
KL-20 processors (2040, 2050).

Before discussing paging, it would be well to quickly review KL
address management. This discussion frequently refers to the KL
subsystems described in Section 1, and you might find it useful to
consult Figure 1 as needed. Another available aid is Appendix A,
which contains a glossary of commonly used paging terms.

Three different types of address are possible: physical,
executive virtual, and user virtual. Physical addresses are 22 bits
long and denote a word in the physical address space. The physical
address space can contain as many as four million words. The
average programmer rarely encounters physical addressing, but a
study of the KL requires a knowledge of where physical addresses are
used. There are four circumstances that deserve attention:

1. All requests to the memory subsystem must take the form of
physical addresses. Thus any request made by the Mbox using the
Sbus must have been translated, by the Mbox, to a physical
address. Also, any transfer involving an external data
channel has to be initiated in terms of physical addresses.

2. Certain address inputs to the Mbox are expressed as
physical addresses. Most significant are requests for
(Cbus) transfers between RH20 controllers and the Mbox. When
a monitor program needs to initiate disk I/O, for example,
the monitor must convert the address of the I/O buffer from
virtual to physical before the transfer is started. The
channel (i.e., the Cbus) then controls the passage of data
from the physical addresses specified. Note that the
treatment of internal data channels is thus logically
consistent with that of external channels: both types
require physical addresses.

3. A tiny subset of Ebox-to-Mbox requests is expressed in
terms of physical addresses. The only physical Ebox
requests are several (but not all) operations originating
in the front-end subsystem.

4. Finally, most diagnostic-bus communication involves
physical addresses.

Another class of address is that of exec-virtual. This address
is 18 bits long and is converted (by the Mbox) into a 22-bit
physical address before it is sent to the memory. An address
reference from an instruction is treated as exec-virtual if it
originated in an instruction being performed while the processor is
in exec mode (kernel or supervisor). The translation from
exec-virtual to physical address is described in Section 2.1 for
-10s and 2.2 for -20s.

Most I/O requests are expressed in terms of exec-virtual
addresses. The only requests that are not exec-virtual are
data-channel requests (which are physically addressed, as described
above) and some real-time transfers [...]. An example of the use of
exec-virtual addresses to accomplish I/O is [...] of DECtape, paper
tape, or unit record [...].

A large proportion of instruction references are exec-virtual.
Specifically, any instruction executed in [...] exec-virtual
reference. Consider [...]: the PC contains an 18-bit counter, and this
counter always contains a virtual address. The address will be
treated as exec-virtual when the processor is in an exec mode and
user-virtual when the processor is in user mode. Therefore, fetching
an exec instruction requires an exec-virtual-to-physical translation.
Of course, many instructions cause other memory references, thus
adding to the total number of translations that must be made.

The third and last class of address is user virtual.
User-virtual addresses are 18 bits long (like exec-virtual), and
also require translation to physical addresses before memory can be
read or written. Programs running in a user mode (either public or
concealed) use user-virtual addressing. The translation of a
user-virtual address to a physical address is described in Section
2.1 for -10s and 2.2 for -20s.

Of these three types (physical, exec-virtual, and
user-virtual), exec- and user-virtual addresses are the types most
frequently encountered by the system programmer. Any virtual
request must be translated into a physical address by the Mbox.
It should be noted that an Ebox-based memory request might be
any of the three types of address. The address will rarely be
physical, but occurrences of exec-virtual and user-virtual requests
are frequent. The type of address used depends on the circumstances
of the request. For example, suppose that the Ebox has just
finished processing an instruction. It must now read a new
instruction from memory. The address of the new instruction is
found in the processor's PC (Program Counter). Suppose the PC holds
the number 001401. In this case, the Ebox must request the contents
of address 001401 from the Mbox.

But what kind of address is 001401? It cannot be physical, if
for no other reason than because physical addresses have 22 bits and
the PC has only 18 address bits. Therefore the address must be
either user-virtual or exec-virtual, but which?

The answer depends on the processor's mode when the instruction
fetch is done. If the processor is in exec mode (as reflected by PC
bit 5 being 0) then 001401 must be treated as an exec address. On
the other hand, if a user program is being run then PC bit 5 is 1,
indicating that the processor is in user mode. In that event,
001401 is a user address.

At the hardware level, the Ebox makes its request by sending
the address (001401) to the Mbox across the E/M interface.
Additionally, the Ebox must tell the Mbox what kind of addressing is
needed. (The Mbox cannot determine this directly because PC bit 5
determines processor mode, and the PC is in the Ebox.) This is
accomplished by the Ebox sending an additional signal to the Mbox
specifying the address mode.

Another example involves an instruction like "ADD 5,1700". The
Ebox must obtain the contents of 1700 to perform the addition. As
before, this requires that the Ebox set up the address (1700) on the
E/M interface. The Ebox must again inform the Mbox of the desired
addressing scheme (user or exec). The type of addressing is
still dictated by PC bit 5. Thus a user program executing the
instruction will cause 1700 to be treated as a user-virtual address,
while the same instruction performed in the monitor would make 1700
be an exec address.
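
The rule in the last few paragraphs can be sketched in a few lines of Python. This is only an illustration: it assumes the PC is modelled as a 36-bit word whose bits are numbered 0 (leftmost) through 35, and the names `USER_FLAG_BIT` and `address_class` are invented for the sketch.

```python
WORD_BITS = 36
USER_FLAG_BIT = 5  # PC bit 5, counting bit 0 as the leftmost bit (assumed model)


def address_class(pc_word: int) -> str:
    """Say how an address issued under this PC word would be treated."""
    user_bit = (pc_word >> (WORD_BITS - 1 - USER_FLAG_BIT)) & 1
    return "user-virtual" if user_bit else "exec-virtual"


exec_pc = 0o001401              # bit 5 clear: monitor is running
user_pc = (1 << 30) | 0o001401  # bit 5 set: user program is running
print(address_class(exec_pc), address_class(user_pc))
```

The same 18 address bits are classified differently depending only on the mode bit, which is why the Ebox has to pass the mode to the Mbox as a separate signal.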

Amid this sea of confusion there is an island of fact: the 
only part of the system that converts one type of address to another
is the Mbox. If the Ebox supplies a user-virtual address to the 
Mbox, that is because the Ebox found the user-virtual address 
elsewhere. Similarly, if the Ebox feeds the Mbox a physical 
address, then the Ebox was given a physical address by something 
else. The Ebox cannot take a virtual address and translate it, for 
that is the sole province of the Mbox. 

Cache memory is a different matter altogether and has no direct
bearing on the paging concepts just described. Cache provides a
means of eliminating roughly 90% of the possible references to
physical core, thus speeding up CPU operation by a substantial
margin. Please note that the cache contents are indexed by physical
address; therefore cache is only accessed after a virtual address
has been translated to the corresponding physical address.

Section 2.1 describes the specifics of DECsystem-10 paging,
while Section 2.2 provides information on DECSYSTEM-20 paging.



2.1 DECSYSTEM-10-STYLE PAGING ("KI-STYLE")

DECsystem-10 paging is the paging scheme supported by systems
running TOPS-10 (1080, 1090).

Under KI paging, the processor has two "process tables". The
User Process Table (UPT) controls the mapping of all user and some
exec pages. The Exec Process Table (EPT) is used for most exec
addresses.

These tables are also referred to as the user/exec page maps or
the user/exec page map pages.

KI-Style Paging
Figure 3
The basic mapping process involves translation of an 18-bit
virtual address into a 22-bit physical address. In this process the
virtual address is treated as a 9-bit virtual page number and an
adjacent 9-bit "offset" into the page.

|<-------------- 18 bits -------------->|
|  Virtual page number  |    Offset     |

Virtual Address Structure
Figure 4
The mapping hardware removes the 9-bit virtual page number (the
"VPN") from the address, uses these 9 bits to produce a 13-bit
physical page number, and plugs the newly produced physical page
number back into the address. This replacement procedure is the
sole topic of section 2.1.
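
As a rough illustration of the replacement just described, the following Python sketch splits an 18-bit virtual address and rebuilds a 22-bit physical address. The function names are invented for the example, and the physical page number is simply passed in; how it is found is the subject of the rest of this section.

```python
PAGE_SHIFT = 9        # 9-bit offset within a page
NINE_BITS = 0o777     # mask for a 9-bit field


def split_virtual(vaddr: int) -> tuple[int, int]:
    """Return (virtual page number, offset) for an 18-bit virtual address."""
    return (vaddr >> PAGE_SHIFT) & NINE_BITS, vaddr & NINE_BITS


def physical_address(ppn: int, vaddr: int) -> int:
    """Replace the 9-bit VPN with a 13-bit physical page number."""
    _, offset = split_virtual(vaddr)
    return (ppn << PAGE_SHIFT) | offset


vpn, offset = split_virtual(0o277340)
print(oct(vpn), oct(offset), oct(physical_address(0o1234, 0o277340)))
```

Note that the offset passes through untouched; only the page number changes.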
Address Translation
Figure 5

Here is a detailed presentation of the KI-style process tables.

[...] procedure is used to translate the address. *

KI-Style Paging Algorithm
Figure 7

Here is a detailed examination of each of the three steps.
Keep in mind that the ultimate goal of these steps is to produce a
13-bit physical page number.



Find Address of Appropriate Table 

One of the two process tables contains the information needed 
to translate the address. Each process table is pointed to by a 
base register. In the case of the UPT the User Base Register (UBR)
is used, while the EPT is pointed to by the Exec Base Register
(EBR). The EBR is loaded when the system is started and never
changed again. Conversely, the UBR is reloaded every time a job
context switch takes place.

Both the UBR and EBR hold the (13-bit) physical page number of the
page containing the corresponding process table.

* The algorithm shows the complete logical paging process. The 
hardware generally takes shortcuts in the mapping process. However, 
these shortcuts involve the hardware page table (Section 3.2.1) and
cache (Section 3.2.2). 
Obtain Map Data for Specified Virtual Page

For this step the virtual page number is used as an index into
the appropriate process table. The exact use of the virtual page
number depends on whether the address is user or exec, and on what
part of the virtual address space the virtual address is in. The
breakdown is as follows:



2.1.2.1 All User Addresses

The 9-bit VPN is treated like this:

|<------ 8 bits ------>|<- 1 bit ->|   (offset not used here)
 Index into process      Specifies which
 table                   half-word to use

VPN Breakdown
Figure 8

The 8-bit field selects the process table word that holds the
map data. Since the map information for a given virtual page
occupies 18 bits, each process table word contains information about
two pages; the desired half-word is chosen by the rightmost bit of
the VPN. If the bit is 0 the left half-word is used, while 1
implies the right half-word.

For example, suppose the address is user-virtual 040333. This
can be interpreted as a request for word 333 of virtual page 040.
The virtual page number breaks down like this:

040(8) = 00010000 0

UPT word ..... 020(8)   0 ..... left half

which means that the 18-bit map data are in UPT word 020, left half.

Similarly, map data for user address 277340 are in UPT word
137, right half.
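
The half-word selection rule is easy to check mechanically. Here is a minimal Python sketch of it; the name `user_map_slot` is made up for the illustration and does not come from the monitor or microcode.

```python
def user_map_slot(vaddr: int) -> tuple[int, str]:
    """Return (UPT word index, half) holding the map data for a user address."""
    vpn = (vaddr >> 9) & 0o777           # 9-bit virtual page number
    word = vpn >> 1                      # upper 8 bits select the UPT word
    half = "right" if vpn & 1 else "left"  # rightmost VPN bit selects the half
    return word, half


word, half = user_map_slot(0o277340)
print(oct(word), half)
```

For user address 277340 this gives UPT word 137, right half, in agreement with the text.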
2.1.2.2 Exec Addresses Between 000000 and 337777 (Exclusive of ACs)

The virtual page number is dissected as before. However, the
desired map data are in the EPT, not the UPT. Additionally, the
treatment of the offset (here, the 8-bit index taken from the VPN)
is slightly different. To select the proper EPT word, add 600 to
the offset to produce an index into the table. Then select the
proper half-word using the low order bit.

For example, exec address 002741 would be mapped using the left
half of EPT word 601, as follows:

002(8) = 00000001 0

         001(8)
       + 600(8)
       = 601(8)   LH



2.1.2.3 Exec Address Between 400000 And 777777 

These are handled exactly as user addresses are, except that 
the map data are in the EPT; the offset is not altered before use 
as an index into the process table. 

For example, exec address 403375 is mapped through EPT word 
201, right half. 

403(8) = 10000001 1

         201(8)   1

leads to 201 RH



2.1.2.4 Exec Addresses Between 340000 and 377777

Add 220 to the offset and read the desired word from the UPT.
Unlike any other exec addresses, this range is mapped through the
UPT, not the EPT.

An instance of this is exec address 340040. It would be mapped
through UPT word 400, left half.

340(8) = 01110000 0

         160(8)
       + 220(8)
       = 400(8)   LH
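
The three exec ranges can be collected into one small routine. The following Python sketch is an illustration only: it uses the constants from the worked examples (600 and 220, both octal), the name `exec_map_slot` is invented, and the special handling of the accumulators (the lowest exec addresses) is not modelled.

```python
def exec_map_slot(vaddr: int) -> tuple[str, int, str]:
    """Return (table, word index, half) holding the map data for an exec address."""
    vpn = (vaddr >> 9) & 0o777
    index = vpn >> 1                    # 8-bit index taken from the VPN
    half = "RH" if vpn & 1 else "LH"    # rightmost VPN bit selects the half
    if vpn <= 0o337:                    # exec 000000-337777
        return "EPT", index + 0o600, half
    if vpn <= 0o377:                    # exec 340000-377777, per-process area
        return "UPT", index + 0o220, half
    return "EPT", index, half           # exec 400000-777777, index unaltered


for address in (0o002741, 0o340040, 0o403375):
    print(oct(address), exec_map_slot(address))
```

Running it on the three example addresses reproduces EPT word 601 LH, UPT word 400 LH, and EPT word 201 RH.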
Produce Physical Page Number of Desired Virtual Page

The process table data found during step 2 look like this:

| A | P | W | S | C |   Physical page number (13 bits)   |
  Page access keys      of virtual page

KI Map Data
Figure 9
The page access keys have these meanings:

A  0 implies that the page is inaccessible. References to
   such a page cause a page fault.
P  0 implies that the page is concealed, while 1 implies the
   page is public.
W  0 indicates that the page cannot be written. Attempts to
   write result in a page fault.
S  Available to the software. TOPS-10 uses this bit in
   conjunction with VM paging.
C  0 indicates that the contents of this page may not be
   placed in cache.

Of course, the physical page number is the 13-bit quantity we've
been seeking. This field simply replaces the 9-bit virtual page
number in the original address, thus providing the final physical
address.
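
To make the layout concrete, here is a hedged Python sketch that unpacks an 18-bit KI map entry. It assumes the five keys occupy the five leftmost bits in the order A, P, W, S, C, followed by the 13-bit physical page number; the class and function names are invented for the example.

```python
from typing import NamedTuple


class KIMapEntry(NamedTuple):
    accessible: bool   # A
    public: bool       # P
    writable: bool     # W
    software: bool     # S
    cacheable: bool    # C
    page: int          # 13-bit physical page number


def decode_ki_entry(half_word: int) -> KIMapEntry:
    """Unpack an 18-bit map entry (assumed layout: A P W S C, then page)."""
    def bit(position: int) -> bool:
        # position 0 is the leftmost bit of the 18-bit half-word
        return bool((half_word >> (17 - position)) & 1)

    return KIMapEntry(bit(0), bit(1), bit(2), bit(3), bit(4),
                      half_word & 0o17777)


entry = decode_ki_entry((0b10101 << 13) | 0o1234)
print(entry)
```

An entry with A clear would be reported as inaccessible, which is the condition that leads to a page fault.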



2.2 DECSYSTEM-20-STYLE PAGING ("KL-STYLE")

The DECSYSTEM-20 paging scheme is conceptually the same as that
of the DECsystem-10 in that both map the limited user address space
into a much larger memory pool. In the -10, the user space is
mapped primarily within physical core, with some pages being
mapped instead to a page on disk. The -20, however, is more
general: not only can user space comprise a portion of core, but it
can also map user pages to parts of a file on disk. More generally,
the system permits efficient sharing of both file and
core pages between processes; the -10 shares only core.
The description in this section focuses on the behavior of the
KL20 paging microcode. However, the narrative will occasionally
touch on TOPS-20's use of various pointers; otherwise, it's hard to
see how the different pointer types are used.

Your understanding of the paging process will be helped by
realizing the role of the paging microcode. The microcode
completely handles requests for in-core pages; any other condition
requires action on the part of the monitor. These other cases
include reference to a disk-resident page, attempts to use
non-existent pages, and troubles in the paging hardware. Any of
these situations causes the microcode to issue a page fault, which is
a hardware trap that terminates the current operation and gives
control to the monitor.



As with so-called KI-style paging, KL paging involves the
replacement of a virtual page number with a physical page number.
There are two tables called the "user process table" (UPT) and "exec
process table" (EPT), each one page long. These are analogous to
the KI-style "user process table" and "exec process table". The
user and exec process tables can be anywhere in physical core, with
their addresses held in the User Base Register (UBR) and Exec Base
Register (EBR) respectively.

Now for a significant difference: under KL paging, the UPT and 
EPT do not hold relocation information! Rather, word 440 of each 
process table contains a pointer which, when evaluated, will lead 
the hardware to a "page map"; this page map contains the mappings 
for specific virtual pages. This scheme is reflected by the 
following diagram: 



Simplified KL Paging
Figure 10

You might wish to compare the KI process tables (Figure 6) with
the KL process tables shown here.

EXECUTIVE PROCESS TABLE
KL-style page map entries differ markedly from the pointers used in
KI-style paging (see Figure 9). On KL20 processors the page map
entries are one word long, and hold a pointer which must be
evaluated to find the desired page.
This page may either be on disk or, more commonly, in core.

Both pointer types have the same format:

|<------------ 36 bits ------------>|
|           Mapping data            |

Pointer Format
Figure 12

Despite this similarity, the two pointer types function
[...] different purposes. [...] in succeeding sections, as section
pointers are treated in section 2.2.1 and map pointers are explained
in section 2.2.2.


The entire mapping process can be treated as a series of
discrete steps, as follows:

The following sections examine each of these steps in somewhat
more detail.
1. Get virtual address from the Ebox

The virtual address arrives from the Ebox as 18 address bits
plus a signal indicating whether the address is exec or user.
Actually, nothing happens in this step other than hardware
handshaking (wireshaking?); however, this is a good place for us to
logically split the address into two halves. The high-order 9 bits,
bits 18-26, are treated as the virtual page number, while the low
order 9 bits, bits 27-35, are used as the offset into the page. The
virtual page number supplies the pager with the information
necessary to determine the address of the page corresponding to the
specified virtual page. Once determined, the physical page number
replaces the virtual page number in the original address. The 9-bit
word index will never be changed by the mapping process. This is
illustrated in Figure 5.

2. Find address of appropriate process table 

This is handled the same way it was on the KI. Please see 
section 2.1.1. 

3. Obtain section pointer from process table 

The section pointer is held in location 440 of the process
table.

4. Use section pointer to find the appropriate page map

This operation begins at a process table. For a user address
this is the UPT, which is pointed to by the User Base Register
(UBR). For an exec address, the EPT is used. The EPT is pointed to
by the Exec Base Register (EBR).

Once the process table is found, the pager reads word 440, the 
USECT (or ESECT) word. The word contains a "section pointer" which 
eventually produces the address of the page map. 

There are four different kinds of section pointers. They are 
treated in section 2.2.1. 

5. Obtain a map pointer from the page map 

The preceding step provided the address of a page map. A page
map contains 512 one-word entries, each of which specifies the
physical address of a single memory page belonging to a process's
virtual address space. Usually the page is in core, though it's
sometimes on disk. Rarely the reference is illegal and corresponds
to nothing, in which case the issuing process is in error.
6. Use map pointer to find desired memory

Page map pointers (hereafter referred to as map pointers) have
the same format as section pointers, but are used somewhat
differently. There are four types: no-access, immediate, shared,
and indirect. They are discussed in detail in section 2.2.2.

Please keep in mind that the ultimate goal of this step is
either to determine the 13-bit physical page number corresponding to
the virtual page specified in the original address, or to produce a
disk address that the monitor will use to bring in the needed page.



Section Pointers 

Keep in mind that the section pointer's purpose is to point to
a page map. Evaluation of the pointer will eventually produce a
13-bit physical page number. In that page is the page map. Note that
only 13 bits are needed to find the page map: all page maps start
on a physical page boundary, and there are at most 2^13 physical
pages of memory.

The treatment of this pointer varies depending on the first 3
bits:

if the first 3 bits are...    then the pointer is...

        000                   no-access
        001                   immediate
        010                   shared
        011                   indirect
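
A short Python sketch of this dispatch follows. It assumes the usual convention that bits of a 36-bit word are numbered 0 (leftmost) through 35; the helper `field` and the "unknown" result for codes the text does not describe are choices made for the illustration.

```python
POINTER_TYPES = {0: "no-access", 1: "immediate", 2: "shared", 3: "indirect"}


def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 36-bit word (bit 0 is the leftmost)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def pointer_type(word: int) -> str:
    """Classify a section or map pointer by its first 3 bits."""
    return POINTER_TYPES.get(field(word, 0, 2), "unknown")


print(pointer_type(0o2 << 33))
```

The same classification applies to map pointers, since both pointer kinds share one format.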



2.2.1.1 No-access section pointers

No-Access Section Pointer
Figure 14

If a memory reference leads to a no-access section pointer,
then the reference is illegal. The result is a page fault, and
further processing of the memory request is determined by the page
fault handling software.
The capability exists primarily for the sake of generality.
Recall that there are map pointers in addition to section pointers,
and that map and section pointers have the same format. As will be
shown in section 2.2.2 there is need for no-access map pointers.
Since they must be included, it was easiest to provide a section
pointer that behaves the same way.



2.2.1.2 Immediate section pointers 
Immediate Section Pointer Format
Figure 15

An immediate section pointer provides the address of a
section's page map. The page map may be in core, in which case bits
12-17 of the pointer are zero. The page map might be on disk,
however. This case is signalled by a non-zero value in bits 12-17,
a condition that causes a page fault. The monitor then uses bits
12-35 as the disk address of the page map and reads it into core.

Assuming that the page map is in core (as indicated by bits
12-17=0), then the 13-bit physical page number of the page map is
found in bits 23-35 of the pointer.
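
The two cases can be sketched as follows. This is an illustration under the assumption that bit 0 is the leftmost bit of the 36-bit pointer; `field` and `decode_immediate_section` are hypothetical names.

```python
def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 36-bit word (bit 0 is the leftmost)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def decode_immediate_section(pointer: int) -> tuple[str, int]:
    """Return ("core", page number) or ("disk", disk address) for the page map."""
    if field(pointer, 12, 17) == 0:
        return "core", field(pointer, 23, 35)   # 13-bit physical page number
    return "disk", field(pointer, 12, 35)       # monitor must read the map in


in_core = (1 << 33) | 0o1234
on_disk = (1 << 33) | (1 << 18) | 0o5
print(decode_immediate_section(in_core), decode_immediate_section(on_disk))
```

In the "disk" case the real hardware would take a page fault; the sketch merely reports the disk address the monitor would use.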
Immediate Section Pointer Structure
Figure 16

From the point of view of TOPS-20, immediate section pointers
exist for much the same reason that no-access section pointers
exist: namely, as parallels to immediate map pointers.



2.2.1.3 Shared section pointers

Shared Section Pointer Format
Figure 17



In this case, the address of the desired page map is not built
into the pointer. Instead, the page map's address is in the Shared
Pages Table (SPT). The section pointer provides an index into the
table, and the SPT location thus specified contains the desired
13-bit physical page number. The pointer need not contain the
address of the SPT; the pager knows this independent of the pointer
because the pager's SPT Base Register (SBR) was loaded long
beforehand with the SPT's page address. The offset is found in bits
18-35 of the pointer.

In other words, the page map's address is stored in an SPT
word. The pager always knows where the SPT itself is, so the
pointer only has to say which SPT word holds the data. The address
is obtained by adding the 13-bit SBR to the offset from the section
pointer thus:

  13-bit SBR           xxxxxxxxxxxxx000000000
+ 18-bit SPT index     0000xxxxxxxxxxxxxxxxxx
  -------------------------------------------
= 22-bit physical word address of the word that contains the page
  map address



It is from this address that the pager obtains the address of the
page map.
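
The addition can be written out in a couple of lines of Python. This is only a sketch: it treats the SBR as a 13-bit page number that is shifted left by 9 bits to form a word address, as the alignment in the sum suggests, and `spt_word_address` is an invented name.

```python
def spt_word_address(sbr_page: int, spt_index: int) -> int:
    """Physical word address of the SPT entry selected by an 18-bit index."""
    base = (sbr_page & 0o17777) << 9          # SPT starts on a page boundary
    return (base + (spt_index & 0o777777)) & 0o17777777  # keep 22 bits


print(oct(spt_word_address(0o17, 0o25)))
```

With the SPT starting at physical page 17 (octal), index 25 selects physical word 17025.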



It should be stated that the SPT word consists primarily of the 
page map address, but not exclusively. The full format of the SPT 
word is: 
SPT Word Format
Figure 18

Shared Section Pointer Structure
Figure 19

Indirect Section Pointer
Figure 20
Unlike either immediate or shared section pointers, indirect
section pointers do not lead the pager directly to the address of a
page map. Instead, an indirect section pointer results in
acquisition of a new section pointer, which may in turn be
no-access, immediate, shared, or indirect.

Once the pager has the indirect pointer, bits 18-35 furnish an
index into the shared pages table. In that location the pager finds
the physical page number of a special table called a "section table."
The section table may contain as many as 512 entries, each of which
is a new section pointer. Bits 9 through 17 of the original section
pointer furnish an index into the section table. The indicated
location holds a new section pointer (no-access, shared, immediate,
or indirect) which will be evaluated appropriately.
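
The evaluation of all four section pointer types can be sketched as a small loop. The Python below is a simplified model, not the microcode: physical memory is a dictionary of word address to 36-bit word, every table is assumed to be in core, the page number in an SPT word is assumed to sit in the rightmost 13 bits, and the `limit` guard against endless indirection is an addition made for the sketch.

```python
NO_ACCESS, IMMEDIATE, SHARED, INDIRECT = range(4)
PAGE_MASK = 0o17777


class PageFault(Exception):
    """Raised where the real pager would trap to the monitor."""


def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 36-bit word (bit 0 is the leftmost)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def resolve_section(pointer: int, memory: dict, sbr_page: int, limit: int = 16) -> int:
    """Follow a section pointer to the physical page number of a page map."""
    for _ in range(limit):
        kind = field(pointer, 0, 2)
        if kind == IMMEDIATE:
            return field(pointer, 23, 35)
        spt_address = (sbr_page << 9) + field(pointer, 18, 35)
        if kind == SHARED:
            return memory[spt_address] & PAGE_MASK
        if kind == INDIRECT:
            table_page = memory[spt_address] & PAGE_MASK
            # bits 9-17 index the section table; fetch the next pointer
            pointer = memory[(table_page << 9) + field(pointer, 9, 17)]
            continue
        raise PageFault("no-access or unrecognised pointer type")
    raise PageFault("indirection chain too long")


sbr = 0o10
memory = {
    (sbr << 9) + 5: 0o77,                   # SPT word 5: section table is page 77
    (0o77 << 9) + 3: (1 << 33) | 0o1234,    # section table entry 3: immediate
}
indirect = (3 << 33) | (3 << 18) | 5
print(oct(resolve_section(indirect, memory, sbr)))
```

Here the indirect pointer leads through SPT word 5 to section table entry 3, whose immediate pointer names page 1234 as the page map.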
Map Pointers

As with section pointers, the treatment of this pointer varies
depending on the first 3 bits:

if the first 3 bits are...    then the pointer is...

        000                   no-access
        001                   immediate
        010                   shared
        011                   indirect



2.2.2.1 No-access pointers

No-Access Map Pointer
Figure 22



This pointer indicates that the specified virtual page is not
part of the requesting process. The result is a page fault.

No-access pointers are used to prevent a process from using
illegal and unassigned pages.



2.2.2.2 Immediate map pointers 



Immediate Map Pointer
Figure 23
A given page may reside either in core or on disk. If it's in
core, pointer bits 12-22 are zero, and the page's physical number is
held in bits 23-35 of the map pointer. These 13 bits are
concatenated with the original 9-bit page index to provide the final
physical address.

If the page is on disk instead, then bits 12-22 are non-zero.
This condition forces the microcode to issue a page fault, in
response to which the monitor uses bits 12-35 as the disk address of
the desired page.
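
The last step of a successful translation can be sketched like this. The Python below is a hedged illustration: bit 0 is taken as the leftmost bit of the 36-bit pointer, the `PageFault` class stands in for the hardware trap, and the function name is invented.

```python
class PageFault(Exception):
    """Stand-in for the hardware trap; carries the disk address for the monitor."""

    def __init__(self, disk_address: int):
        super().__init__(f"page on disk at {disk_address:o}")
        self.disk_address = disk_address


def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 36-bit word (bit 0 is the leftmost)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def translate_immediate(map_pointer: int, vaddr: int) -> int:
    """Form the 22-bit physical address from an immediate map pointer."""
    if field(map_pointer, 12, 22) != 0:
        raise PageFault(field(map_pointer, 12, 35))
    page = field(map_pointer, 23, 35)
    return (page << 9) | (vaddr & 0o777)   # page number joined to 9-bit index


print(oct(translate_immediate((1 << 33) | 0o1234, 0o277340)))
```

With the page in core at physical page 1234, virtual address 277340 becomes physical address 1234340.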



Immediate Map Pointer Structure
Figure 24

Immediate map pointers are used for private pages, i.e. pages
belonging to exactly one process.



2.2.2.3 Shared map pointers

Shared Map Pointer
Figure 25

Shared map pointers provide an index into the system's SPT.
The SPT location thus specified contains the 13-bit physical page
number of the desired virtual page. (See the description of shared
section pointers for more detail on this.)

Shared Map Pointer Structure
Figure 26



A shared map pointer is used for a page that belongs to several 
different processes. For instance, suppose that a particular page 
contains only executable code that is part of a compiler. Several 
different processes may be compiling at any given time, so the 
various page maps will each contain a shared map pointer to the 
shared page. By doing this, the system can swap out the page and 
painlessly inform all interested processes simply by changing the 
single SPT pointer. If immediate pointers were used instead of 
shared pointers, the system would be forced to find all page maps 
using that page and change them individually. 
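
The benefit of the extra level of indirection shows up clearly in a toy model. In the Python sketch below, page maps and the SPT are plain dictionaries and pointers are `(kind, value)` tuples; none of this reflects the real word formats, and all names are invented for the illustration.

```python
def lookup(page_map: dict, vpn: int, spt: dict) -> int:
    """Return the physical page for a virtual page in a toy page map."""
    kind, value = page_map[vpn]
    if kind == "shared":
        return spt[value]      # value is an SPT index
    return value               # immediate: value is the page number itself


spt = {7: 0o1234}                           # SPT word 7 locates the compiler page
process_a = {0o50: ("shared", 7)}
process_b = {0o20: ("shared", 7)}

before = [lookup(process_a, 0o50, spt), lookup(process_b, 0o20, spt)]
spt[7] = 0o4321                             # the system moves the page: one update
after = [lookup(process_a, 0o50, spt), lookup(process_b, 0o20, spt)]
print(before, after)
```

Both processes see the new location after a single change to the SPT, with neither page map touched.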



Another use of shared pointers arises from TOPS-20's treatment
of disk I/O. When a program uses a page from a disk file, that page
is considered shared between the program and the file; the file is
viewed as a process. In brief, when a file is opened, its index
Block (XB) is read into core, and its format is the same as a page
map. Initially, all the XB pointers are immediate. Now suppose the
user maps process page 50 to correspond to file page 20. This
results in the disk address of file page 20 being placed in the SPT.
Then entry 50 of the user page map and entry 20 of the file's XB are
both changed to shared pointers so both now use the same SPT word.
When the page is referenced, then the page is read in and the SPT
entry is changed to a core address.



The beauty of this mechanism is that TOPS-20 uses the same copy 
of the XB for every process that uses the file. For instance, 
suppose a new process decides to use our file, specifically page 20. 
The system need not read in the page again; instead, the new
process's page map is given the shared pointer from the XB. When
[...] of the page is automatically
referenced, thanks to the information in the SPT.



2.2.2.4 Indirect map pointers

Indirect Map Pointer
Figure 27
The mapping table, in turn, contains up to 512 entries. [...]
The resulting address contains a new pointer to be evaluated.

There are several occasions for use of indirect pointers. One
of these is the case of process A examining a page in process B.
For the sake of argument, suppose A's page 130 is mapped to
correspond to B's page 36. When the mapping occurs, entry 130 in
A's page map becomes an indirect pointer. At the same time, an SPT
word is loaded with the address of B's page map; the SPT address of
that word is put into the indirect pointer. Then bits 9-17 of the
pointer are loaded with 36 (the page number of B's page).
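
The construction of such a pointer can be sketched in Python. This is an illustration only: it assumes type code 3 in bits 0-2, the page number in bits 9-17, and the SPT index in bits 18-35 (bit 0 being the leftmost bit), it reads the page number 36 as octal, and the SPT index 25 is an arbitrary value chosen for the example.

```python
INDIRECT = 3


def make_indirect_pointer(spt_index: int, page_index: int) -> int:
    """Build a 36-bit indirect pointer from an SPT index and a page number."""
    return ((INDIRECT << 33)
            | ((page_index & 0o777) << 18)    # bits 9-17: entry in B's page map
            | (spt_index & 0o777777))         # bits 18-35: SPT word for B's map


pointer = make_indirect_pointer(spt_index=0o25, page_index=0o36)
print(f"{pointer:012o}")
```

Entry 130 of A's page map would then hold this word, and a reference through it ends up evaluating entry 36 of B's page map.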
The Ebox (Execution box) has two basic purposes. First, it
performs the execution of program instructions from memory. Second,
it must interface non-channel I/O devices to memory.

The configuration diagram in Figure 1 reveals that
the Ebox has three links to the outside world. These are the Ebus,
the E/M interface, and the diagnostic bus.

The Ebus connects the Ebox to the system's I/O devices and the
Front End. The Ebus carries all control information from the CPU to
the output devices. Additionally, the Ebus transmits data to those
devices that do not have a data channel. Similarly, all devices
send control information back to the CPU through the Ebus, and
non-channel devices send data via the same route.

The E/M interface connects the Ebox to the Mbox. The
information carried across this set of links is not normally of
interest to the programmer, but typical signals include the 22 bits
of address desired by the Ebox, a signal indicating whether that
address is virtual or physical, and signals describing the nature of
an Mbox-detected page fault.

Finally, the diagnostic bus connects the Ebus to the console
front end. The controlling PDP-11 uses this bus to bootstrap the KL
and to gather information about the KL's health (or lack of it).
The Ebox consists of Emitter-Coupled Logic (ECL). The
following components are found in the Ebox of any KL-based
system:

* arithmetic logic

* accumulator blocks

* microcode and microprocessor

* Program Counter (PC)

* meters

The following sections of this chapter provide further details
on the accumulator blocks, the microcode, the PC, and the KL meters.



Accumulator Blocks 

The accumulator blocks are variously called [...].

[...] the KI was given four blocks of 16 ACs.
Block 0 was permanently assigned to any exec-mode program, but user
programs could be given any of the four blocks. The TOPS-10
convention gave block 1 to the current user and left blocks 2 and 3
unused. (The unused blocks could, however, be used by realtime
programs. A program that wanted to use block [...], for instance,
could take advantage of its user I/O privileges to issue a DATAO PAG
instruction to switch to the desired block.)

The KL is built with eight sets of 16 accumulators. KL
software can [...] use multiple AC blocks because the KL, unlike
the KI, permits both user- and exec-mode programs to use any of the
eight blocks. TOPS-10 assigns the blocks as follows:

0     most monitor operations (cf. 2 and 3)
1     current user program
2     scanner interrupt-level code
3     disk interrupt-level code
4-6   unused, available for realtime
7     reserved for use by the microcode

The TOPS-20 assignments are:

0     exec-mode programs
1     user-mode or previous context exec ACs
2-5   unused
6     KL paging
7     reserved for use by the microcode



You will periodically see references in DEC documentation to
"previous context ACs" and "current context ACs." This distinction
relates to the PXCT (Previous context eXeCuTe) hardware instruction,
which is similar in concept to the exec-mode XCT of the KI. PXCT is
described in the Hardware Reference Manual.



Microcode 

The KL's operation is governed by microcode. While there are
several microstores in various parts of the machine, this discussion
centers on the CRAM (Control RAM) and the DRAM (Dispatch RAM).
These two RAMs (Random Access Memories) form the "instruction
execution" logic. They are writable semiconductor memories that are
loaded by the console front end processor when the system is brought
up.

The CRAM is 2343 words long, with each word 34 bits wide. It 
contains the microprogram that implements the DECsystem-10 or -20
instruction set, priority interrupts, etc. To give you an idea of 
the things controlled by the CRAM program, here is a list of some of 
the program modules of the microcode: 

• Startup and stop handler - called at the end of each instruction
  to look for new PIs, etc.

• Effective address manager - computes an instruction's effective
  address using the instruction's I, X, and Y fields. (It does not,
  however, compute the corresponding physical address;
  virtual-to-physical mapping is done by the Mbox.)

• Executor routine - contains the separate subroutines that
  implement specific -10 or -20 instructions (e.g. half-word moves
  and stack manipulation).

• Priority interrupt handler - checks if a PI has been requested.
  It is called from various points in EA calculation, and during
  some long instructions such as BLT (thus preventing lengthy
  operations from seriously delaying interrupt handling).

• Page fault handler - called when the Mbox can't resolve a
  virtual-to-physical address translation for some reason (e.g. the
  access-allowed bit being 0 for a virtual page).

• Input-output handler - generates any Ebus dialogue required by
  I/O instructions (DATAx, CONx).

The Dispatch RAM (DRAM) is 512 words long by 24 bits wide. The
CRAM program uses the DRAM to decide how to process a given -10 or
-20 instruction by obtaining, from the DRAM, the address of the
specific CRAM routine that handles the instruction. For instance,
suppose the current instruction is a MOVEI, for which the opcode is
201. The CRAM would first compute the effective address of the
instruction (regardless of the fact that it is a MOVEI; the EA is
computed as a first step in processing any instruction). Then the
CRAM obtains the address of the MOVEI subprogram from DRAM word 201,
and jumps to it. Similarly, if the instruction was a MOVE (opcode
200), the dispatch address would come from DRAM word 200. In other
words, the DRAM's entries are indexed by instruction opcode.
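
The opcode-indexed dispatch described above can be illustrated with a
small Python sketch. It is only a model of the idea, not the KL
microcode: the handler functions, the simplified effective-address
routine (which ignores the I and X fields), and the table name are
all assumptions made for illustration.

```python
# Hypothetical model of DRAM-style dispatch: a 512-entry table indexed
# by the 9-bit opcode. Handler names are invented for illustration.

def compute_effective_address(instruction):
    # Simplification (assumption): use only the Y field (low 18 bits),
    # ignoring indirection (I) and indexing (X).
    return instruction & 0o777777

def do_move(ea):
    return ("MOVE", ea)

def do_movei(ea):
    return ("MOVEI", ea)

# One entry per opcode, like the 512-word DRAM.
DISPATCH = [None] * 512
DISPATCH[0o200] = do_move
DISPATCH[0o201] = do_movei

def execute(instruction):
    # The opcode is the top 9 bits (bits 0-8) of the 36-bit word.
    opcode = (instruction >> 27) & 0o777
    # The EA is computed first, whatever the instruction is.
    ea = compute_effective_address(instruction)
    handler = DISPATCH[opcode]
    if handler is None:
        raise NotImplementedError("no handler for opcode %o" % opcode)
    return handler(ea)
```

Building an instruction word as (0o201 << 27) | 0o1234 and passing it
to execute() selects the MOVEI handler purely by table lookup.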



PC Word



The KL PC word format is identical to that of the KI. It's
described in the Hardware Reference Manual.



KL Clocks 



The KL processor contains four programmable clocks. They are
the

• interval timer

• time base

• accounting meters

• performance analysis counter



The clocks are controlled by use of the three I/O device codes
TIM, MTR, and PAG. All hardware clock logic is ECL and is contained
in the KL mainframe with the Ebox and Mbox.

The following presents a more detailed view of each of the KL 
clocks. 

Interval Timer 

The interval timer is similar in function to the DK10 clock.
The timer can, at the programmer's option, interrupt on any desired
PI level. The resolution of the clock is 10 microseconds, and the
interval is programmable between 10 microseconds and 40.95
milliseconds.

The interval timer comprises a 12-bit counter and a period
register. The period register is loaded by program control and
reflects the desired frequency of interrupt. As mentioned earlier,
the frequency can range from 10 microseconds to 40.95 milliseconds
in increments of 10 microseconds. If the program sets the period
to four, the timer will go off every four increments, i.e.
every 40 microseconds. When the timer goes off it causes a vectored
priority interrupt to EPT word 514. The interrupt occurs on the
clock's program-assigned PI level.
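
The arithmetic behind these limits can be checked with a short
Python sketch. The function and constant names are illustrative
assumptions; only the 12-bit width and the 10-microsecond resolution
come from the text.

```python
TICK_US = 10        # counter resolution in microseconds (from the text)
COUNTER_BITS = 12   # width of the interval counter (from the text)

def interval_us(period):
    """Return the interrupt interval, in microseconds, for a period value."""
    max_period = (1 << COUNTER_BITS) - 1   # largest value a 12-bit field holds
    if not 1 <= period <= max_period:
        raise ValueError("period must be between 1 and %d" % max_period)
    return period * TICK_US
```

A period of four gives 40 microseconds, and the largest 12-bit value
(4095) gives 40,950 microseconds, i.e. the 40.95 milliseconds quoted
above.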

The interval timer is controlled and interrogated by use of the
instructions CONO TIM and CONI TIM respectively.



Time Base



The time base is used to measure long-term elapsed time with
one microsecond resolution. It offers accuracy of +/-.005%, which
amounts to a maximum of about five seconds drift over 24 hours.

The time base is a 60-bit clock. Its length permits it to
count intervals of 9140 years, after which it unfortunately
overflows. The time base is incremented every microsecond.
Theoretically, the 60-bit count could be maintained in the EPT and
incremented there every microsecond. The increment isn't done this
way, though, because this would result in a blizzard of memory
references to the EPT. To hold down the overhead, the time base's
count is incremented in a 16-bit register contained in the Ebox.
Only when the count carries into the high order bit of this register
is the count added to the 60-bit total in the EPT, after which the
16-bit register is cleared. The disadvantage of this technique is
that the current time is not immediately available by looking at the
EPT. For that reason, the system provides an instruction ("DATAI
TIM") that produces the current time. The 60-bit quantity is held
in EPT words 510 and 511.
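
The register-plus-memory scheme can be modelled in a few lines of
Python. This is a sketch under stated assumptions, not a description
of the hardware: the class and attribute names are invented, the EPT
doubleword is represented by a single integer, and reading the time
is modelled simply as the memory total plus the pending register
count.

```python
class TimeBaseModel:
    """Toy model: a 16-bit fast register backed by a 60-bit total in memory."""

    REG_BITS = 16
    TOTAL_MASK = (1 << 60) - 1

    def __init__(self):
        self.ept_total = 0   # stands in for the 60-bit count in the EPT
        self.register = 0    # stands in for the 16-bit Ebox register

    def tick(self):
        """Advance the clock by one microsecond."""
        self.register += 1
        # When the count carries into the register's high-order bit,
        # fold it into the memory total and clear the register.
        if self.register & (1 << (self.REG_BITS - 1)):
            self.ept_total = (self.ept_total + self.register) & self.TOTAL_MASK
            self.register = 0

    def read_time(self):
        """Current time: the memory total alone is stale, so add the register."""
        return (self.ept_total + self.register) & self.TOTAL_MASK
```

The model shows why looking at the memory total alone gives a stale
value: between folds, up to 32767 microseconds are held only in the
register.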

The time base is controlled and interrogated by the "DATAI
TIM", "CONO MTR", "RDTIME" (read time-base doubleword), and "CONI
MTR" instructions.

Accounting Meters



The accounting meters are, unsurprisingly, intended for job
accounting. They consist of an Ebox busy meter and a memory cycle
meter. As they can be programmed to shut off during PI processing,
they offer an extremely reproducible way of billing users and
comparing program performance.

These are two 60-bit meters. One of these, the Ebox busy
meter, increments while the Ebox is executing microcode. The other
meter, the Mbox cycle meter, counts the number of times the Ebox
references memory through the Mbox.

The accounting meters are similar to the time base in that the
Ebox contains two 16-bit registers (one for each meter), and the
60-bit values reside in memory. The Ebox busy meter is kept in UPT
words 504 and 505, while the Mbox cycle meter occupies UPT words 506
and 507. In this connection, it is interesting to note that the
time base is kept in the EPT (as it is a system-wide count), while
the accounting meters are put in the UPT (since they contain
information about a particular process).

The relevant hardware instructions are "CONO MTR", "CONI MTR",
"DATAI MTR" (also called "RDEACT"), "BLKI MTR" ("RDMACT"), and
"DATAO PAG" (which causes the meters to be saved on a context
switch).

Performance Analysis Counter



The performance analysis counter is a built-in hardware
monitor designed to gather information that would be
difficult or impossible to get using software probes. The
performance analysis counter permits sophisticated system
measurements to be made. It offers advantages not available with
software monitoring. For instance, it does not interfere with
system operation. Another feature is the ability to identify events
happening at the sub-microsecond level.

The counter itself is a 60-bit counter that is maintained in EPT
words 512 and 513.

Use of the counter, being rather complex, is not intended for
the inexperienced. For that reason this document does not describe
the counter in detail, but you might wish to note that the counter
can measure combinations of the following conditions:

• User mode

• PI level active

• cache miss

• cache writeback

• cache sweep

• Ebox-Mbox request

• microprogram event

• channel busy

• ECL probe input

The counter is controlled by "BLKO TIM" (or "WRPAE"), and "BLKI
TIM" (or "RDPERF").

Complete details on the counter's use can be found in the
hardware document entitled Meters Unit Description, EX-MTa-cro-301.



The following two sections describe TOPS-10 and TOPS-20 meter
usage conventions.



TOPS-10 Meter Usage 

The following description applies to the 6.03 monitor. 

Interval timer       Provides the jiffy clock tick (every 60th of a
                     second in 60 Hz countries, every 50th of a
                     second in 50 Hz countries).

Time base            Records time of day, and optionally, job
                     accounting if feature test switch FTEMRT is zero.

Accounting meters    Job accounting if feature test switch FTEMRT is
                     non-zero.

Performance meter    Accessible using the PERF. monitor call.



TOPS-20 Meter Usage

As of the Release 2 monitor, the following usage prevailed.

Interval timer       Interrupts every millisecond. The interrupt
                     handler maintains a count of the number of
                     interrupts, and upon occurrence of the 20th tick
                     control is given to the monitor overhead cycle.

Time base            Used for time-of-day maintenance and job
                     accounting.

Accounting meters    Not used

Performance meters   Not used

The job of the Mbox is to connect devices to memory. All
KL-based processors require the Ebox to access memory through the
Mbox. Additionally, some systems (the 1090, 2040, and 2050) replace
the old DF10/10C data-channels with internal data-channels that are
connected to the Mbox instead of a memory port.

The external Mbox connections shown in Figure 1 are the E/M
interface, the Sbus, the Cbus (on some systems), and the diagnostic
bus. The E/M interface was described in Section 3.1. The Sbus
connects the Mbox to the memory subsystem. The Cbus links the Mbox
to as many as 8 RH20 controllers and serves as a data-channel. The
diagnostic bus permits the Front End to control Mbox operation and
determine Mbox status.

The Mbox may contain the following components, depending upon
the system:

• hardware page table

• user base register

• exec base register

• cache memory

• Cbus interface (internal channels)

Not all of these are present in any given system, as shown in this
table. Any component not mentioned is present in all.

                     1080    1090    2040    2050

Cache                  Y       Y       N       Y

Internal channels      N       Y*      Y       Y

* but may also contain external channels



[...] in the introduction to Chapter [...] terminology.

[...] in the form of virtual addresses (either user or exec) which
must be [...] may use the cache, thus involving the [...]



Hardware Page Table 



Suppose the Ebox requests the contents of a particular virtual
address. Theoretically, the Mbox must read a section pointer from a
process table, probably find an SPT word, and read a page-map entry.
If this procedure were used, the system would have to make several
memory accesses whenever one was requested, thus drastically
slowing memory references.

The KI addressed this by having a 32-word associative memory.
Built of semiconductor memory and internal to the CPU, it could hold
the most recently used page-map entries for rapid future access.
The CPU only had to read the in-core page-map if the desired entry
was not in the associative memory.

The KL has a similar but improved technique. Instead of an
associative memory, the pager contains a hardware page table, which
[...] exec- and user-process tables (page maps for [...]).

[Several lines are illegible in the source. The legible fragments
indicate that the virtual page number is used to index the hardware
page table, which holds virtual page mappings for both user and
exec.]

The most direct possible solution would have been to use the
virtual page number as an index into the page-table, and simply
include a status bit for each entry that indicated whether the
mapping was user or exec. If that were done, however, a new problem
would appear. The problem is that in any given process (whether
user or exec), there are usually many references to the first few
pages. If the simple scheme just described were used, it would mean
that a reference to user page 000 would be written in page-table
word 000. Then, if the user process issued a monitor call, it would
be highly likely that a memory reference would shortly be made to
exec page 000. That would cause the mapping for exec page 000 to be
written over the user 000 mapping. Then when control was returned
to the user, the mapping would have to be recomputed and stored back
into the page-table (thus wiping out exec 000's mapping again).
Such page-table refill activity is a form of thrashing.

To avoid thrashing, the page-table is structured in such a way
that the mapping for exec page 000 is in a different place from user
page 000. The procedure used is this. When the Mbox looks for an
entry in the hardware page-table, it picks up the 9-bit virtual page
number from the virtual address. Next, it flips bit 19 of the page
number (the second from the left as we view it) if, and only if, the
virtual reference was from user space. The resulting 9-bit number
is used as an index into the page-table.

For example, suppose the Mbox desires the mapping for exec
virtual address 002074. The virtual page number is 002 (octal),
which is 000000010 in binary. Since the address is exec, bit 19 is
not changed. Therefore the map data are either in word 002 of the
page-table, or the data are not in the page-table at all. In the
latter case, the Mbox would have to determine the map data using the
process tables and then load it into the page-table.

Alternatively, suppose the Mbox needs user-virtual address
002362. The VPN is 002 (octal), or 000000010 (binary). This time
the address is user, so bit 19 gets flipped, from 0 to 1, which leads
to a modified index of 010000010, or 202. In this case, then, the
desired data are in page-table word 202, or else not in the
page-table at all.

Please note that the virtual page numbers in these two examples
were the same (002). But because the addresses used were from
different address spaces, the desired page-table entries were
different.
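
The index calculation in these two examples can be written out as a
short Python sketch. The function name and the boolean "user"
argument are assumptions for illustration; the bit manipulation
follows the procedure described above, treating the virtual address
as an 18-bit quantity whose top nine bits are the virtual page
number.

```python
def page_table_index(virtual_address, user):
    """Return the hardware page-table index for an 18-bit virtual address."""
    vpn = (virtual_address >> 9) & 0o777   # 9-bit virtual page number
    if user:
        # Bit 19 is the second bit from the left of the 9-bit page
        # number, i.e. the bit with octal weight 200.
        vpn ^= 0o200
    return vpn
```

With this function, exec address 002074 maps to index 002 and user
address 002362 maps to index 202, matching the worked examples.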

Analysis of this scheme would reveal that the format of the
KL's page-table is this:

    +------------------------+
    | Exec pages 000-177     |
    |   or                   |  128 entries
    | User pages 200-377     |
    +------------------------+
    | Exec pages 200-377     |
    |   or                   |  128 entries
    | User pages 000-177     |
    +------------------------+
    | Exec pages 400-577     |
    |   or                   |  128 entries
    | User pages 600-777     |
    +------------------------+
    | Exec pages 600-777     |
    |   or                   |  128 entries
    | User pages 400-577     |
    +------------------------+
          512 entries

         KL Page Table
           Figure 29



Cache



The purpose of cache memory is to speed up instruction
execution by substantially reducing the time needed for the
average memory reference. This is accomplished by placing a
high-speed semiconductor memory (the "cache") inside the Mbox. The
cache holds up to 2K words from core. Whenever the program
accesses a word held in cache, the request is satisfied in 160
nanoseconds, as opposed to a microsecond or more for a core
reference.

The success of this scheme depends on the quality of the
algorithm used to decide which 2K from core is put into cache.
Which locations are cached changes constantly with system
demands, but the algorithm is based on the assumption that memory
references tend to be somewhat localized. As an example, consider a
typical program's structure. Usually, the flow of control is linear
within a narrow scope; if an instruction has just been executed
from location N, there is a good chance that there will soon be an
instruction executed from location N+1.

This assumption has worked out quite well. The "hit rate" for the
KL's cache memory is better than 90%. In other words, any given
memory reference has nine chances in ten of being satisfied from
cache, thus saving a great deal of time. The algorithm used in the
KL was developed at Stanford University using extensive modelling.



-Note- 



The cache contents are addressed 
by physical addresses. Thus cache 
comes into play only after a virtual 
address has been converted to
physical.



As mentioned earlier, the cache can hold up to 2048 words from
core memory. The cache is arranged in four pages, as follows:

    +----------+  +----------+  +----------+  +----------+
    |  Page 0  |  |  Page 1  |  |  Page 2  |  |  Page 3  |
    | 512 wds  |  | 512 wds  |  | 512 wds  |  | 512 wds  |
    +----------+  +----------+  +----------+  +----------+

                        Cache Pages
                         Figure 30

For simplicity, let us consider one of these four pages and the
format of the data stored within it. The structure to be described
is identical for each of the four pages of cache.

Each cache page has a directory associated with it. A
directory consists of 128 entries, each entry being 13 bits wide. A
single directory entry contains information concerning four words of
data within the cache page.

This gives rise to a structure that looks like this. (For
convenience the diagram uses decimal arithmetic.)

    Directory entry      Cache words
    ---------------      -------------------
           0               0    1    2    3
           1               4    5    6    7
           .               .
           .               .
         127             508  509  510  511

Structure of the Cache Page
Figure 31

The four-word cell described by a single directory entry is
called a quadword. The 13-bit directory entry for a quadword
contains the physical page number of the page in core from which the
quadword came. In turn, the position of a word within a cache page
is always the same as the position of the word in its original page
of core.

Let us consider a specific, somewhat simplified example. Assume
that for the moment we have only one page of cache and its
associated directory, rather than the four that are really provided
in the hardware. Suppose that the Ebox has requested the contents
of physical address 14707302, a 22-bit address. The Mbox has first
to determine if it must read core location 14707302, or better, if
location 14707302 is already in cache. The first step is to split
the address into a 13-bit physical page number (14707) and a 9-bit
index into the page (302). In other words, we are concerned with
the 302nd word of physical page 14707. If this word is already
cached, then the only place it could be in our single cache page
would be in word 302 (because "the position of a word within a
cache page is always the same as the position of the word in its
original page"). The Mbox must therefore examine the directory
entry corresponding to the 302nd word of the page and compare it to
the desired physical page number of 14707. If the directory entry
holds 14707, then the 302nd cache page word is the one we're
looking for. If the comparison fails, then we have no choice but to
read physical core.

It might be worthwhile to examine the significance of quadwords
in some detail. Since there is exactly one directory entry for each
quadword, it follows that all four words in the quadword must come
from the same physical page. Moreover, keep in mind that a word
must have the same position in the cache page that it had in the
physical core page. These facts imply that the four words in a
single quadword are physically contiguous in core as well as in
cache.

The example just traced was simplified by the omission of
three-fourths of the cache pages. In a real system with four pages
(2K) of cache, a given physical word might actually reside in any
one of the four pages of cache. Let us return to our example for a
moment. If we need physical address 14707302, we have to keep in
mind the physical page number (14707) and the index into that page
(302). If the word resides in any of the four cache pages, we know
it has to be in word 302 of whichever page holds it, just as we knew
earlier that it had to be in word 302 of the single cache page.
Therefore, the Mbox has to compare the desired physical page number
of 14707 to the contents of four directory entries, one for word 302
of each of the four cache pages. If a match is found for any one,
then the data is taken from the proper page. Otherwise, physical
core must be read.
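
The lookup just described can be sketched in Python as follows. This
is a simplified model under stated assumptions, not the Mbox logic:
the class and function names are invented, an empty directory entry
is represented by None rather than by a valid bit, and a miss is
reported by returning None instead of reading core.

```python
WORDS_PER_PAGE = 512          # words in one cache page
ENTRIES_PER_DIRECTORY = 128   # one directory entry per quadword

class CachePage:
    def __init__(self):
        # Each directory entry holds the physical page number that its
        # quadword came from (None here means "empty").
        self.directory = [None] * ENTRIES_PER_DIRECTORY
        self.words = [0] * WORDS_PER_PAGE

def split_address(physical_address):
    """Split a 22-bit address into (13-bit page number, 9-bit index)."""
    return physical_address >> 9, physical_address & 0o777

def cache_lookup(cache_pages, physical_address):
    """Return the cached word, or None if no cache page holds it."""
    page_number, index = split_address(physical_address)
    entry = index >> 2   # four words share one directory entry
    for page in cache_pages:
        if page.directory[entry] == page_number:
            return page.words[index]   # same position as in the core page
    return None   # miss: physical core would have to be read
```

For address 14707302 (octal), split_address yields page 14707 and
index 302, and the loop compares 14707 against one directory entry
in each of the four cache pages.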

The system just described should serve to introduce you to the
KL implementation of cache. There are several further
characteristics that deserve mention.

* KL cache is not a write-through cache. If the Ebox instructs the
  Mbox to write a given location, the location is modified only in
  cache. The corresponding physical location will be updated only
  when the monitor instructs the Mbox to sweep cache, or when a
  quadword must be emptied to make room for new data. This fact
  has considerable importance for multi-processor KL systems.

* KL cache is organized to handle physical addresses. The cache
  scheme used on some other large systems, however, is oriented to
  virtual addresses. Stanford's modelling demonstrated that the
  use of virtual addresses in the cache algorithm is less efficient
  than use of physical addresses.

* The hardware's use of the cache is dependent upon the Mbox
  microcode. This microcode is normally set up to support use of
  all four cache pages and four-way interleaving. If desired,
  however, some or all of the cache can be turned off. This option
  is exercised when the front end is initializing the -10 at system
  startup.

There are three different operations to which the monitor can
subject the cache: invalidation, validation, and unloading. Any of
these operations can be performed on the entire cache, or on entries
belonging to a single page.

Invalidation of a location is simply to clear its valid and
written bits, all of which has the effect of simply emptying the
location. Validation of a location means that if an entry has been
written since it was brought in from memory, then the modified
contents must be written back into physical core. This situation
arises from the fact that the cache is not a write-through cache.
Finally, the unloading of a location first requires the Mbox to
validate the location, then to invalidate. In other words, the
location is first written into core if it has been changed since
being loaded, then the location is emptied.
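
The relationship among the three operations can be sketched in
Python. Everything here is a hypothetical model: the CacheEntry
class, its fields, and the use of a dictionary to stand in for
physical core are assumptions. Clearing the written bit after a
write-back is also an assumption; the text says only that the
modified contents are written back.

```python
class CacheEntry:
    """Toy cache location with the valid and written bits from the text."""
    def __init__(self, address=None, data=None):
        self.address = address
        self.data = data
        self.valid = False
        self.written = False

def invalidate(entry):
    # Clearing both bits empties the location.
    entry.valid = False
    entry.written = False

def validate(entry, core):
    # Write the contents back only if they changed since being loaded.
    if entry.valid and entry.written:
        core[entry.address] = entry.data
        entry.written = False   # assumption: core now matches cache

def unload(entry, core):
    # Unloading is validation followed by invalidation.
    validate(entry, core)
    invalidate(entry)
```

Note that unload() is defined entirely in terms of the other two
operations, which mirrors the description above.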



Core Status Table 



(KL-style only)



The Core Status Table (CST) is indexed by physical page number
and contains one word for each physical core page. Each word has
the following format:

    [Diagram: CST data word, showing the age stamp in bits 0-5, the
    Process Use Register, and the "page modified" bit.]

CST Data Word
Figure 32



The microcode references the CST only when the pager has to get
data from memory (as opposed to finding it in cache). When this
happens, the CST entry for the referenced page is checked. If age
stamp bits 0-5 are non-zero, the reference proceeds. However, if
the age stamp is zero, a page fault occurs.

Here's why. Periodically, the monitor may decide to housekeep
system storage, which results in various process pages being placed
on the system free list. Theoretically, the monitor could write
these pages on disk and change the pointer for that page to reflect
the new location. However, there's no guarantee that all the pages
just released will immediately be given away again. So if a page is
not reassigned, and the last owner of the page tries to use it
again, the monitor would have to read the page back from disk, even
though it's still in core!

The CST gets around this problem. When a page is added to the
free list, the pointer to that page is left intact. The monitor
only zeroes the age stamp in that page's CST entry, and purges the
page's data from cache. After this, two situations can arise.
First, suppose the page is assigned to another process. At that
time, the page's contents are written to disk (if necessary), the
old pointer is changed, and the mapping proceeds. No time is lost
over the scheme described earlier; things just happen later.
However, suppose the page isn't reassigned, and the original owner
tries to use the page again. The pager won't find the desired word
in cache, because cache was flushed when the page was added to the
free list. Therefore, the pager checks the CST, finds the bits
zero, and generates a page fault. The monitor then takes over,
determines what's happened, and gives the page back to the process
by simply stamping bits 0-5 of the CST entry. Unnecessary writes
and reads are avoided.
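
The age-stamp mechanism can be illustrated with a small Python
sketch. The function names and the list used to represent the CST
are assumptions for illustration. The only facts taken from the text
are that the age stamp occupies bits 0-5 of a CST word (the six
high-order bits of a 36-bit word in DEC bit numbering) and that a
zero stamp causes a page fault.

```python
AGE_SHIFT = 30                  # bits 0-5 are the top six bits of 36
AGE_MASK = 0o77 << AGE_SHIFT

def age_stamp(cst_word):
    return (cst_word & AGE_MASK) >> AGE_SHIFT

def release_page(cst, page):
    """Put a page on the free list: zero only the age stamp."""
    cst[page] &= ~AGE_MASK      # the rest of the word is left intact

def reference(cst, page):
    """What the pager does on a reference that misses the cache."""
    return "page fault" if age_stamp(cst[page]) == 0 else "proceed"

def reclaim_page(cst, page, age):
    """Give the page back to its owner by restamping bits 0-5."""
    cst[page] = (cst[page] & ~AGE_MASK) | ((age & 0o77) << AGE_SHIFT)
```

In this model a released page costs nothing until someone touches
it: the fault on the next reference is what gives the monitor the
chance to restamp the entry rather than read the page from disk.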

CST entries also contain a Process Use Register (PUR) and a
"page modified" bit. The PUR reflects the way a page is being
shared by different processes. The page modified bit is set when
page data is changed. When a page must be swapped out, it needs to
be written only if it's been changed; otherwise, the original copy
on disk is still valid. At page-out time, the monitor decides on
the need for swap-out by checking bit 25 of the CST entry for the
page under consideration.



Internal Channels (Cbus) 

The Cbus, and associated "internal channels", replace the older
DF10/DF10C/DAS33 data-channels. Cbus features include such
advantages as increased reliability and lower cost. From a system
programmer's point of view, there are two principal differences.

First, the Cbus permits up to eight RH20 controllers to attach
to the Mbox. Each RH20 effectively has its own data-channel in the
form of a Cbus connection. And since each controller has its own
channel, they can all be transferring simultaneously. In older
configurations, to get the same capability would require that each
controller have its own DF10-style data-channel, which leads to
considerable expense.

Second, the Mbox provides a sixteen-word buffer for each
controller on the Cbus. This buffer provides protection
against data overruns.

A look at page 48 reveals that the Cbus departs significantly
from external channels in that the Cbus communicates solely with the
Mbox, which in turn handles all transfers to or from the memory
subsystem. External channels had direct connections to memory
ports. Although the Mbox might seem to be a bottleneck in
Cbus-equipped systems, it has been determined by testing that the
Cbus runs no greater risk of overrun than external channels did.

Unfortunately, not all channel devices can be attached to the
Cbus. Notable exceptions include such DECsystem-10 devices as the
RH10 disk controller and the DX10 controller for TU70 tape. Systems
that have these devices are equipped with internal channels where
possible, and external channels when needed.

A fringe benefit of channeling data through the Cbus is that 
channel reads can get data from cache. This is impossible using 
external channels, since the data path avoids the Mbox. The 
advantage partially extends to output; although the Cbus cannot 
write cache, it does cause selective invalidation of cache words 
that have been changed. (Input is not directed to cache because 
cache would tend to be flooded in an I/O environment.) Writes using 
external channels required a cache sweep following the transfer.

MEMORY SUBSYSTEM



The implementation of core memory varies considerably between
the DECsystem-10 computers and the DECSYSTEM-20s. The -20 line
features "internal memory", while the -10 line uses memories that
are external to the CPU.

Memory subsystem       1080    1090    2040    2050

Internal memory          N       N       Y       Y

DMA                      Y       Y       N       N

External channels        Y       *       N       N

* Sometimes - see Section 3.2-4

External Memories



The external memories in use with KL systems are the MG10s and
MH10s. The MG10 is normally used with 1080 systems, while 1090
systems are being shipped with MH10s. The two memories are similar
internally, as follows:


    Bank 0 (128K)     Controller 0
    Bank 1 (128K)     Controller 1

    Ports 0 through 7

An MH10 Memory
Figure 33



A single MG10 or MH10 consists of two banks of 64/128K words, each
bank having its own controller. Thus any given memory location can
be serviced by exactly one controller. Since a controller can
handle at most one request at a time, simultaneous requests for two
locations within a single bank will result in one of the requests
waiting until the other is complete. On the other hand, the two
controllers operate completely independently of one another.
Therefore, simultaneous requests can be made and serviced as long as
the two locations needed are in different banks.

Note also that the memory has eight ports. These are priority
ordered with ports 0 and 1 sharing highest priority, ports 2 and 3
sharing second priority, and ports 4 through 7 having the lowest. A
request coming in on any port can be sent to whichever controller is
required.



The following diagram represents a typical 1090 memory
configuration.

Typical 1090 Memory Configuration
Figure 34*

* Note that the use of external memories dictates the presence of
a DMA. The DMA interfaces the Sbus (one word wide) to the four
Kbuses, each of which is also one word wide. In this way the system
has a four-word data path into the DMA and a one-word data path
between the DMA and the Mbox. The four Kbuses are important to the
correct operation of interleave, which is normally four-way. This
is best illustrated by means of an example. Consider the following
sequence of events:

1. The Ebox asks the Mbox for the contents of a memory 
location. For the sake of example, supoose that the 
location needed is physical address 1730. 

2. The Mbox attempts to satisfy the request by looking h„ 
cache. Frequently the desired data will already be in 
cache, in which case no reference to physical core need be 
made. Suppose, however, that location 1709 is not in anv 
or the four cache pages. This leads to Steo 3 
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3. Now the Mbox will read location 1780 from physical core 
into cache. In fact, not only will location 1700 be 
cached, but so will the other three words in the quadword. 
Thus the Mbox needs to read words 1730, 1701, 1702, and 
1703. To do this, the Mbox requests the DMA (Direct Memory 

— Access} to read the desired four words and pass them across 
the Sbus to the Mbox for caching, 

4. The DMA proceeds to issue four simultaneous requests, one 
for each of its four Kbuses. The memories were configured 
for four-way interleaving when the system was first brought
up, which guarantees that word 1700 will reside in memory 0
bank 0, word 1701 will be in memory 0 bank 1, 1702 will be
in memory 1 bank 0, and 1703 will be in memory 1 bank 1.
Since no two of these words are in the same bank, the four 
requests will be handled by four different controllers, in 
parallel. Note that the first request issued, and thus the 
first to be honored, is for the address originally needed. 
In this way, further processing can take place while the 
rest of the quadword is being filled. 

5. As the data is sent back to the DMA from the memories, the 
DMA passes the information along to the Mbox. Thus the 
quadword is filled in cache, and ultimately the original 
Ebox request is satisfied.

6. This concludes our examination of the Ebox request.
However, it is worth noting that in many cases the Ebox
will shortly request the word adjacent to the original
word, in this case 1701. If that happens, the Mbox will
find that 1701 is in cache, thus obviating most of the work
outlined above with considerable saving of time. 

It is apparent from this example that four-way interleaving on 
a KL system is powerfully tied to the concept of cache quadwords. 
It is for this reason that system throughput suffers on caching 
systems whose memories are configured for either two-way or no-way 
interleaving. 
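
The bank assignment and fetch order in the example above can be sketched in a few lines of Python. This is an illustration only, not the hardware's actual logic. It assumes that the two low-order address bits select the controller (the higher of the two picking the memory unit, the lower picking the bank), which matches the 1700-1703 example, and it further assumes that the words following the requested one are fetched in wraparound order; the text says only that the requested word comes first. The function names are hypothetical.

```python
def bank_for(addr):
    """Return (memory unit, bank) for a word under four-way interleaving.

    Assumption: the two low-order address bits select the controller.
    """
    low = addr & 0o3
    return (low >> 1, low & 1)


def quadword_fetch_order(addr):
    """List the four words of addr's quadword, requested word first.

    Assumption: the remaining three words follow in wraparound order.
    """
    base = addr & ~0o3
    return [base + ((addr + i) & 0o3) for i in range(4)]


order = quadword_fetch_order(0o1700)
for a in order:
    unit, bank = bank_for(a)
    print(f"word {a:o}: memory {unit} bank {bank}")
```

Because the four words of a quadword always differ in their two low-order bits, they always land on four different controllers, which is what allows the four Kbus requests to proceed in parallel.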

As a final note, it should be mentioned that memory 
configuration depends on the program in the Mbox microcode. The 
choice of configuration is made when the system is brought up. 
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Internal Memories 

DECSYSTEM-20 machines feature internal memories. These offer
improved reliability and lower cost than external memories. So far,
the only internal memories offered have been the MA20 and the MB20.

A close look at an internal memory such as the MA20 shows it to be
similar to the MG10. Like the MG10, an MA20 has two memory banks,
each with its own controller. The two controllers operate
independently of each other, thus providing the ability to overlap
within a single unit of MA20 memory. However, the MA20 contains no
ports like those of the MG10. There is no need for them, as -20
systems support no devices having external data-channels, so all
memory requests are handled by the Mbox. These requests, in turn,
are fed back and forth through the Sbus. By the same token, the
MB20 is analogous to the ME10.



Neither is there a DMA on a -20 system. Instead, the various
memory controllers communicate directly with the Mbox.



[Diagram: channels feed the Mbox (channel I/O moves through Cbus
and Mbox, thence to Sbus); the Sbus connects to two MA20/MB20
memory units of 128K each.]

Figure 35
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The principal ingredient of the front-end subsystem is the 
PDP-11 computer. Like any PDP-11, it is connected to its devices by 
its UNIBUS as shown in Figure 36.

The DTE is the interface between the Front End and the KL CPU.
The primary purpose of the DTE is to permit the Front End to control
and monitor the operation of the KL CPU. The KL can support up to
four DTEs.






The DTE provides the following functions:

• examine or deposit of words in specified areas of KL memory;

• high speed, simultaneous, two-way data transfer (so-called
"byte transfers");

• doorbell interrupts: the -11 can interrupt the KL and vice versa.
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Generalized Front-End Subsystem 
Figure 36
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Additionally, a specially enabled DTE can: 
• examine or deposit words into any area of KL memory regardless of
protection;

• control and obtain status from the KL CPU;

• let the -11 bootstrap the KL;

• let the KL bootstrap the -11.

The DTE has two operating modes: restricted and privileged.
This is determined by the setting of a manual switch on the DTE. A
PDP-11 attached to a restricted DTE can perform the first set of
functions listed on Page 52, while a privileged DTE/-11 pair can do
everything listed on Page 52. Normally only the master -11 (usually
attached to DTE0) is privileged.
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There are two different ways the DTE lets an -11 communicate 
with KL memory. First, the -11 can use the examine/deposit feature, 
which permits the -11 to read or write a single KL word. The other 
way is with byte transfers, in which the DTE is responsible for 
transferring a string of data to or from KL memory without tying up
either the KL or -11 CPU.

Examines and deposits can always be made to any address within
windows defined in KL memory. The windows are specified by the KL's
exec process table. There are two windows, known as the to-KL area
and the to-11 area. These differ in their availability to the two
processors, as follows:

              Can KL write?    Can -11 write?

to-KL area          Y                Y

to-11 area          Y                N



A restricted front end cannot examine or deposit outside of the
windows. This permits the KL to protect itself from a wayward -11.
However, a privileged front end can examine and deposit anywhere in
KL memory, without regard for protection.
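
The window check can be pictured with a short Python sketch. It is a model of the rule stated above, not of the DTE hardware: it assumes that a restricted -11 supplies an offset relative to the start of the window, that the protection word is compared as a simple upper bound, and that a privileged request is taken as a physical address with no check at all. The function name and parameters are hypothetical.

```python
def examine_address(request, protection, relocation, privileged=False):
    """Translate an -11 examine request into a KL physical address.

    protection: window length in 36-bit words
    relocation: physical address at which the window begins
    Assumption: a restricted request is an offset into the window.
    """
    if privileged:
        # Assumed simplification: privileged access bypasses the window.
        return request
    if not 0 <= request < protection:
        raise PermissionError("examine outside the to-11 window")
    return relocation + request


print(oct(examine_address(0o5, protection=0o100, relocation=0o4000)))
```

A deposit would be checked the same way against the deposit protection and relocation words.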

The other transfer mechanism is the byte transfer. It has the 
following characteristics: 
• Permits transfers to or from anywhere in KL memory;

• Byte size can be eight or sixteen bits, at the programmer's
option;

• Supports simultaneous to-11 and to-KL transfers.

Once the transfer has been initiated, the DTE handles it
without further intervention from either CPU at the program level. 
In other words, the KL monitor will not be interrupted until the 
transfer is complete. The DTE can recognize the end of the transfer 
either by the transfer of a null byte or by expiration of a byte 
counter. Transfer completion results in an interrupt on the 
assigned PI level. 

It is important to understand how the byte transfer is being 
handled internally. It was stated in the preceding paragraph that 
the KL monitor does not see an interrupt from the DTE until the 
transfer is complete. This is true, but please note that the Ebox 
is internally interrupted by the DTE for every byte passed across 
the DTE. The interrupt comes through on PI level 0, which does not 
cause an interrupt that is visible to the operating system. The 
effect of the level 0 interrupt is to force the Ebox to move a byte
between the DTE and the Mbox. Thus, every byte transferred through
the DTE results in a small amount of CPU overhead, but does not
require monitor action. 
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Byte transfers are not limited to the windows. This does not 
represent a security problem since in the case of a to-KL byte 
transfer the KL, and not the PDP-11, specifies the byte pointer and 
thus the destination address in KL memory.

Both types of transfer require use of the Ebox, which implies
that the microcode must be running. If the microcode is inoperable,
a privileged -11 can use the DTE's diagnostic bus to access KL
memory.

Both types of transfer (examine/deposit and byte transfer) are
controlled in part by locations in the exec process table. These
locations begin at octal 140:

140 + 8*N    To-11 byte pointer

141 + 8*N    To-10 byte pointer

142 + 8*N    DTE-20 interrupt instruction

143 + 8*N    Reserved for DEC hardware

144 + 8*N    Examine protection word

145 + 8*N    Examine relocation word

146 + 8*N    Deposit protection word

147 + 8*N    Deposit relocation word

where N is in the range 0-3 and denotes the DTE under consideration.
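
As a worked illustration of the 140 + 8*N layout, the following Python sketch computes the eight exec process table locations belonging to a given DTE. The names are hypothetical and are used here only as dictionary keys; the offsets are those in the table above.

```python
DTE_EPT_BASE = 0o140  # first DTE location in the exec process table

DTE_EPT_WORDS = [
    "to-11 byte pointer",
    "to-10 byte pointer",
    "interrupt instruction",
    "reserved",
    "examine protection word",
    "examine relocation word",
    "deposit protection word",
    "deposit relocation word",
]


def dte_ept_locations(n):
    """Map each control word of DTE number n (0-3) to its EPT offset."""
    if not 0 <= n <= 3:
        raise ValueError("DTE number must be in the range 0-3")
    base = DTE_EPT_BASE + 8 * n
    return {name: base + i for i, name in enumerate(DTE_EPT_WORDS)}


for name, loc in dte_ept_locations(1).items():
    print(f"{loc:o}  {name}")
```

The four DTEs together occupy EPT locations 140 through 177 (octal).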

Here is a more detailed description of these locations.

• To-11 byte pointer -- a byte pointer, set up in standard KL
format, that tells the DTE what data to transfer to the -11. The
pointer directs the DTE to exec-virtual addresses. The length of
the string is determined either by a count or by the presence of
a null byte at the end of the string, at the option of the
programmer.

• To-10 byte pointer -- same as to-11 pointer, with the obvious
exception that this pointer is used on to-KL transfers from the
-11.

• DTE-20 interrupt instruction -- contains the instruction that
will be performed as an interrupt instruction when the DTE
interrupts the KL. The DTE is a vectored-interrupt device, so it
does not interrupt through EPT locations 40+2N and 41+2N as many
older devices do. Instead, the interrupt instruction is taken
from this location.

Please note that the interrupt causing this instruction to be
executed will be caused by events such as transfer complete and
inter-CPU doorbell. Level 0 interrupts arising from byte
transfers will not go through this location; indeed, they will
not produce an interrupt visible to the operating system at all.

• Examine protection word -- contains the length of the to-11
window. The length is expressed in 36-bit words.

• Examine relocation word -- contains the beginning physical
address of the to-11 window.

• Deposit protection word -- contains the length of the to-KL
window.
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• Deposit relocation word — the physical address of the to-KL 
window. 



Certain front-end operations and equipment are found in all
forms of KL system while others are not. Section 3.4.1 describes
those features common to all KLs, while Sections 3.4.2 and 3.4.3
describe the -10 and -20 Front Ends, respectively.



Common Front-End Operations 

Please study Figure 36 as you read this section. 

All KL-based systems rely on the Front End for at least two 
basic functions. First, the -11 is responsible for initiating KL 
CPU operations from a dead stop. This process involves setting up 
KL status, loading all microstores, configuring KL memory, and 
starting the monitor bootstrap. These operations are conducted 
primarily across the diagnostic bus which is shown in Figure 36 
connecting the Front End to both the Ebox and the Mbox. The second 
job of the -11 is to support the console terminal by which an 
operator can control the system. It is this terminal that governs 
the -11 operating system and the tasks running under it. In
addition to controlling the -11 operating system, the console 
terminal can talk directly to TOPS-10 or TOPS-20, thus acting as a 
terminal as well. 

The key element of these jobs is the PDP-11's operating system.
The systems used vary somewhat with processor type, but all are 
based on the RSX operating system. The currently supported 
front-end monitor is RSX-20F. This system runs multiple tasks. One 
task is the "command parser," which is the program that recognizes 
commands typed in on the console terminal. Other tasks include 
KLINIT, which oversees initialization of the KL processor, and 
KLINIK, which provides a telephonic link that permits diagnosis and 
control of the KL from remote locations. 

Devices associated with the Front End include the RH11 disk 
controller, which supports RP04/06 disk drives. Current front-end 
operations require the RH11 to be connected to a dual-ported disk of 
which the other port is connected to a KL controller (RH10 or RH20).
There are both software and hardware interlocks to prevent the KL 
and the -11 from interfering with one another. The disk used has 
several tracks formatted in PDP-11 format, while the rest of the 
disk is KL formatted. In addition to the RH11, the Front End has
either a floppy disk drive or a DECtape drive. These are used as an
alternate bootstrap device if, for some reason, the disk cannot be
used, or contains obsolete data.
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KL systems can be attached to up to four PDP-11s. Only one of
these, however, can be the controlling front end. In order to
prevent conflicts between different -11s, the operations described
in this section can only be done using a "privileged" DTE. A DTE is
made "privileged" or "restricted" by the setting of a manual switch
located on the DTE. Restricted DTEs can still move data between the
KL and the -11; such transfers require the -11 to communicate and
cooperate with the KL using a software protocol, however, which
presupposes that it is already running correctly. Only the
privileged Front End can alter the KL state without permission from
the KL itself.



DECsystem-10 Front-End 

The configuration of the -10 Front End depends upon the use
intended for it. The controlling Front End (i.e. that attached to
a privileged DTE) will have only those devices shown in Figure 36,
and does no more than what was mentioned in Section 3.4.1.

Some -10s have multiple DTEs. One of these will be the
controlling Front End, and the others are part of a DN87S
communications unit. Here is the structure of the DN87S:
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The DN87S includes such basics as -11 memory, the DTE, and the
PDP-11, since the machine couldn't function otherwise. In addition
it has DH11 and DQ11 line interfaces. The DH11 handles as many as 16
asynchronous lines. The DQ11 provides space for a single
synchronous line for a remote station or link. The total capacity
of the DN87S is diagramed below. This permits attachment of
up to 112 asynchronous lines (using seven DH11s), or twelve
synchronous lines (using 12 DQ11s), or any combination. For
instance, one could run 64 asynchronous and 4 synchronous lines on a
single DN87S. Combinations other than those shown are also allowed.

Note that the diagnostic bus is absent from Figure 37. 



LINE MAXIMUMS PER DN87S

Max. No. of        Max. No. of
Sync. Lines        Async. Lines

     0                 112

     4                  64

     8                  32

    12                   0
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3.4.3 DECSYSTEM-20 Front End 
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Typical DECSYSTEM-20 Front-End 
Figure 38 



The -20 Front End is a much busier system under TOPS-20. In
addition to handling those functions described in 3.4.1, DTE-based
PDP-11s are responsible for handling all communications traffic and
all unit-record equipment.
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I/O Subsystem 
Figure 39 
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The I/O subsystem has three possible links to the rest of the 
system. 

• Ebus -- this connects all I/O devices to the Ebox. It is through
the Ebus that devices receive control signals from the CPU and
return device status to the CPU.

• Cbus -- acts as a data-channel between RH20 controllers and the
Mbox. For a complete description please read Section 3.2.4.

• External channels -- data-channels between memory and controllers
for those controllers not able to use the Cbus.

The only component per se of the I/O subsystem, other than I/O 
devices themselves, is the DIA. The need for the DIA arises from 
the fact that the KL's Ebus uses a different hardware protocol than 
the KA/KI I/O bus, even though the basic purpose of both is the 
same. On -10 systems only, it is necessary to connect older I/O
devices to KL systems: devices that were designed to use the I/O
bus protocol. To solve this problem, -10 systems are equipped with
the DIA, which accepts a conventional I/O bus on one side and the KL
Ebus on the other. The DIA is not needed on the -20 because the
kinds of devices that need the I/O bus (e.g. unit-record equipment)
are not connected directly to the KL at all on -20s; instead, they
are attached to the Front End -11.
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Previous Context Execute 

Normally, an instruction's address references are handled
completely within the current context. I.e., if an instruction is
issued in user mode, then all its address references are handled as
user-virtual addresses. There are situations, however, where it is
convenient to cause an exec instruction to reference user-virtual
addresses. For instance, suppose a user process issues a monitor
call (UUO or JSYS) that involves an argument list at user address
770. The monitor cannot read the first argument word by saying
"MOVE AC,770"; that would result in the acquisition of exec word
770, not user word 770. Theoretically, the monitor can set up a new
page-map entry to point to the desired user page, but this procedure
requires many instructions, and would adversely affect the operation
of the pager.

The problem is solved on KL systems by use of the PXCT
(Previous Context Execute) instruction. PXCT, like a conventional
XCT, loads an instruction from the location specified by the PXCT's
effective address. Unlike a conventional XCT, the instruction XCT'd
will be treated, in whole or part, as an instruction performed in
the "previous context"; that is, in the processor mode the
processor was in when the most recent monitor call occurred.
Normally, the previous context will be user mode (public or
concealed). It could, however, be an exec mode.

Reconsider our earlier example. The monitor wishes to read
user address 770. Rather than juggle page maps, the monitor would
issue this instruction:

        PXCT    14,[MOVE AC,770]

This instruction results in user address 770 being put into exec
"AC". (Note that the number "14" in the instruction does not refer
to AC 14; rather, the bit pattern 1100 in the AC field of the PXCT
determines the treatment of the MOVE. This matter is discussed
shortly.)

PXCT has the same opcode as XCT; an XCT becomes a PXCT when: 

• the XCT is performed in exec mode, and 

• the XCT's AC field (instruction bits 9-12) is non-zero. 

By way of example, the instruction "XCT 0, anything" would be a 
conventional XCT, with the target instruction being treated as 
belonging to the current context. However, the instruction "XCT 
5, anything" would be handled in the fashion described below. 

It should be noted that the instruction name PXCT is an exact
synonym for XCT; the distinction between the two names is purely
mnemonic. Proper operation of the PXCT depends on the programmer
setting up the PXCT's AC field.
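
The two conditions under which an XCT behaves as a PXCT can be expressed as a small predicate. The Python sketch below assumes the standard PDP-10 instruction layout (opcode in bits 0-8, AC field in bits 9-12 of a 36-bit word) and the XCT opcode of 256 octal; the opcode value comes from the PDP-10 instruction set rather than from this text, and the helper names are hypothetical.

```python
XCT_OPCODE = 0o256  # PDP-10 opcode for XCT (assumed, not stated in the text)


def make_xct(ac, y):
    """Assemble a 36-bit XCT word with no indexing or indirection."""
    return (XCT_OPCODE << 27) | ((ac & 0o17) << 23) | (y & 0o777777)


def is_pxct(word, exec_mode):
    """True when the word is an XCT that will be treated as a PXCT."""
    opcode = (word >> 27) & 0o777   # bits 0-8
    ac = (word >> 23) & 0o17        # bits 9-12
    return opcode == XCT_OPCODE and exec_mode and ac != 0


print(is_pxct(make_xct(0, 0o1000), exec_mode=True))   # conventional XCT
print(is_pxct(make_xct(5, 0o1000), exec_mode=True))   # treated as PXCT
```

In user mode the predicate is always false, reflecting the first condition in the list above.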
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Correct PXCT behavior requires that the hardware know what the 
previous context was. Previous context is completely defined by the 
following 3 items: 

• previous context AC block number (0-7)

• previous context mode (user or exec)

• previous context protection (public or concealed)

All instructions, not just PXCT, require the translation of
several virtual addresses. For example, suppose the CPU has just
processed an instruction, and a new instruction is to be fetched and
performed. The instruction will be fetched from the address given
in the processor's PC word. PC addresses are virtual, so that
address must be converted to physical before the new instruction can
even be found. Now think ahead to the point where the instruction
has been found and brought into the Ebox. The effective address
must be computed, and that process involves translation, too.
Consider the instruction "MOVE 5,1343". The address 1343 must be
translated to physical. In addition to all this, the system must
also figure out which of the eight possible AC blocks is to be used.
Otherwise the right ACs cannot be found.

For most instructions, all such memory references will be
treated as belonging to the current context. In using PXCT,
however, the programmer has a choice regarding the way some, but not
all, memory references are treated. The following types of
instruction reference will always be exec mode:

• Fetch of the PXCT itself. This is only natural, since until the
instruction has been fetched, the system doesn't even know it's a
PXCT.

• Resolution of the effective address of the PXCT; i.e., the
address of the target instruction is always an exec address.
This too is a necessary function of the way the hardware
operates: effective addresses are computed before the
instruction opcode is looked at.

• AC field in the target instruction. This is not to say all AC
references by the target instruction are exec. For instance,
"PXCT ?,[MOVE 5,1300]" would always move 1300 of the previous
context into exec AC 5, because the number 5 is in the target
instruction's AC field (bits 9-12). By contrast, "PXCT ?,[MOVE
5,6]" might either move user 6 into exec 5, or exec 6 into exec
5, depending on the value of PXCT bits 9-12. The option exists
in the latter instruction because the number 6 appears in the
target instruction's Y field, not the AC field.

Other references may be either in user space or exec space, at 
the programmer's option. This choice is exercised using PXCT bits 
9-12. The meaning of a "1" in any of these bits varies somewhat 
according to the target instruction, so we will treat three 
different classes: general, BLT, and EXTEND. A "1" in a position 
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signifies that the corresponding sort of reference is treated as a 
previous context address. 



4.1 General Instructions

Bit
position

9 Effective address calculation for target instruction

10 Memory operands specified by E, whether fetch or store. (E.g.
source address in MOVE or PUSH, destination in ADDM).

11 Not applicable -- must be 0.

12 Applicable only to PUSH and POP -- address of stack as
reflected by stack pointer.
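
To make the bit numbering concrete, the following Python sketch decodes a PXCT AC field into the instruction bits that are set, using the general-instruction meanings listed above. Bit 9 is the high-order bit of the four-bit AC field and bit 12 the low-order bit. The table and function names are hypothetical.

```python
GENERAL_MEANING = {
    9: "effective address calculation of target instruction",
    10: "memory operand specified by E",
    11: "not applicable (must be 0)",
    12: "stack reference (PUSH and POP only)",
}


def previous_context_bits(ac_field):
    """Return the instruction bit positions (9-12) set in a PXCT AC field."""
    return [bit for bit in (9, 10, 11, 12) if ac_field & (1 << (12 - bit))]


for bit in previous_context_bits(0o14):
    print(bit, GENERAL_MEANING[bit])
```

For the earlier example "PXCT 14,[MOVE AC,770]", the field 14 (octal) is the bit pattern 1100, so bits 9 and 10 are set: both the effective address calculation and the memory operand of the MOVE are taken in the previous context.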



4.2 BLT And XBLT 

Bit 

position 

9 EA calculation of BLT 

10 Destination address (from BLT AC right half)

11 Not applicable -- must be 0

12 Source address (from BLT AC left half) 

For example, this instruction sequence will copy a 50 word block 
from user address 460 to exec address 702. 



        MOVE    AC,[460,,702]   ;SET UP BLT AC WITH [SOURCE,,DESTINATION]
        PXCT    1,[BLT AC,751]  ;EFFECTIVE ADDRESS OF BLT DENOTES LAST
                                ; LOCATION TO BE WRITTEN

Only PXCT bit 12 is set, causing only the BLT source address to be
treated as a user address.
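
The effect of the PXCT bits on a BLT can be simulated in Python. The sketch below is a simplified model, not the microcode: it represents the previous-context and exec address spaces as two dictionaries, ignores paging and the updating of the BLT AC, and honors only bits 10 and 12 as described in the table above. All names are hypothetical. Note that the block from 702 to 751 is 50 words in octal, that is, 40 decimal.

```python
def pxct_blt(pxct_ac, blt_ac, e, prev_mem, exec_mem):
    """Simulate PXCT pxct_ac,[BLT AC,e]; blt_ac is (source, destination).

    Returns the number of words copied.
    """
    src, dst = blt_ac
    src_mem = prev_mem if pxct_ac & 0o1 else exec_mem   # bit 12: source
    dst_mem = prev_mem if pxct_ac & 0o4 else exec_mem   # bit 10: destination
    copied = 0
    while True:
        dst_mem[dst] = src_mem[src]
        copied += 1
        if dst >= e:            # E names the last location to be written
            return copied
        src += 1
        dst += 1


user_mem = {0o460 + i: i for i in range(0o50)}   # sample user data
exec_mem = {}
count = pxct_blt(0o1, (0o460, 0o702), 0o751, user_mem, exec_mem)
print(oct(count), "words copied")
```

With the AC field set to 5 instead of 1, both source and destination would be taken from the previous context.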



4.3 EXTEND 

Bit 
position 

9 EA calculation of both instruction words. Also EXTEND EA 
calculation of source pointer if bit 11=1, and of destination 
pointer if bit 12=1. 

10 Memory reference of second instruction word. 

11 EA calculation of source, and EA calculation of source pointer
if bit 9=1.

12 Destination, and EA calculation of destination pointer if bit
9=1.
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KL PAGING GLOSSARY 



EBR                         Exec Base Register.

Exec Base Register          An internal Mbox register that holds the
                            physical page number of the Exec Process
                            Table.

Private page                A page belonging to the address space of
                            exactly one process. Pointed to by an
                            immediate page map pointer; it is not
                            pointed to by a shared pages table entry.

Section pointer             One word of data, residing in the exec
                            process table or user process table word
                            440, that describes the location of a page
                            map.

Special/Shared Pages Table  A single in-core table comprising a series
                            of physically contiguous pages: it contains
                            the addresses of those pages being shared
                            between a process and a file, and addresses
                            of special per-process data base tables
                            maintained by the monitor.

SPT                         Shared Pages Table.

UBR                         User Base Register.

User Base Register          An internal Mbox register that holds the
                            physical page number of the current User
                            Process Table.

Section table               A one page table used in conjunction with
                            indirect section pointers.
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Chapter 3 

KL10 System Operations 



The information presented in this chapter is primarily for Digital's own
system programmers, for their use in writing the Monitor and other
software. However it is also needed by anyone who wishes to write his own
operating system, to some extent by users who handle their own IO, and by
programmers in a situation where all the facilities of a system are
dedicated to a single large task.



WARNING 

KL10 functions are implemented in microcode, which can be 
changed much more easily than hardware. Although user op- 
erations are deliberately kept as compatible as possible from 
one machine to the next, Digital will change the KL10 sys- 
tem microcode whenever such change will result in greater 
speed, efficiency or effectiveness. Therefore anyone writing 
system software should make sure to use the most recently 
updated version of this documentation, and before embarking 
on any project as enormous and critical as an operating sys- 
tem, to check with Large Systems Engineering for any 
changes not yet documented. 



Programming for the system as a whole is programming in executive 
mode. Only the kernel program is without instruction restrictions, and only 
it can, if needed, access physical memory unpaged. The supervisor program 
labors under the same instruction restrictions as the user and has no way of 
bypassing them, although it can read but not alter concealed pages (the 
kernel program can supply data tables to the supervisor program, and the 
latter cannot affect them). 
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The amount of useful work done by the system depends upon how 
efficiently and effectively the executive manages the system. This means 
selecting which processes will run when, managing their working sets, re- 
sponding to their needs, and even reacting to error situations or perhaps 
downright unacceptable behavior on the part of a user. The kernel program 
accomplishes these objectives by handling all in-out for the system, setting 
up page maps, trap locations, interrupt locations and the like for both itself 
and the users, handling user accounts, communicating with the front end, 
and so forth. In other words, except for handling in-out, the activities of an 
operating system are the topics covered in this chapter. Of course the sys- 
tem programmer must also be quite familiar with all of the material pre- 
sented in the preceding chapters. In particular he must fully understand 
the architecture of the system as discussed in Chapter 1, and must be 
especially well versed in the use of the JRST instruction, MUUOs, and IO
instructions (§§2.9, 2.16, 2.18).

System information for other processors is given in Chapters 4 and 5. 
The present chapter is devoted solely to the KL10, but contains two sections 
on paging, only one of which is applicable to a given system. §3.3 describes 
the paging used with the TOPS-10 Monitor, this paging is similar to that of 
the KI10. §3.4 treats the paging associated with the TOPS-20 Monitor. 
Both kinds of paging employ essentially the same hardware — the differ- 
ence lies principally in the microcode. 

Much of the material presented here is related to the DTE20s, the
channels, and the DIA20. Although the chapter does describe all activities 
of the microcode undertaken for these devices (e.g. the front end functions 
in §3.7), the descriptions of the devices themselves are not included. 



3.1 Priority Interrupt

The DECSYSTEM-20 is essentially a system of processors clustered 
around the E bus. The various controllers and interfaces are subsidiary to 
the PDP-10, but maintain a considerable degree of independence from it. 
Each RH20 Massbus controller operates from its own command list in 
memory and handles all data transfers via the channels; but it must reach 
the Ten program to start a new list or if something should go wrong. Each 
PDP-11 is a whole computer with its own internal program; but for handling
IO equipment or acting as the system console, it must communicate
with Ten memory via the E bus (to which it is interfaced by a DTE20), and 
the peripheral computer must reach the Ten program for setting up mutual 
operations. Basically the priority interrupt system allows the other proces- 
sors to interrupt the central processor at various levels of priority, so that 
all can operate simultaneously. The hardware also allows conditions inter- 
nal to the PDP-10 to signal its own program by requesting an interrupt. 

In a DECsystem-10, the PDP-11 is limited to use as a system console
and diagnostic facility, and the unit-record peripheral equipment is
organized around a KI10-type IO bus connected to the E bus via a DIA20 IO
bus interface. If the system lacks internal channels, Massbus controllers must
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be of the RH10 type, which the program controls via the IO bus. For data
purposes an RH10 is connected to external memory by a separate memory
bus. It is recommended that those who program a DECsystem-10 read both
this section and the first few pages of the discussion of the KI10 interrupt 1
(§5.2).

Interrupt Requests 

Interrupt requests are handled on eight levels arranged in a priority
sequence. Levels are numbered 0-7, with 0 having highest priority. Level 0
is quite unlike the others, however, in that it is available only to the
front end processors for simulating console functions and handling byte
transfers. Moreover level 0 is always active: it cannot be turned off even
by inactivating the interrupt system. The program does control the enabling
of level 0 in the DTE20s, but the master front end can even override that.
Assignment of devices 2 to the remaining levels is entirely at the discretion
of the programmer. To assign a device to a level, the program sends the
number of the level to the device control register as part of the conditions
given by a CONO (usually bits 33-35); a zero assignment disconnects the
device from the interrupt levels altogether. Any number of devices can be
placed on the same level.

When a device requires service, it sends an interrupt request signal on 
its assigned level over the bus to the processor. A request is recognized by 
the processor if the level is active — meaning that both the interrupt sys- 
tem and the individual level 3 have been turned on. But the processor can 
accept no requests while it is processing a request or starting an interrupt 
at any level, or holding an interrupt on the same level or on a level with 
higher priority than those on which requests have been recognized (in other 
words, if the current program is a higher priority interrupt routine). The 
request signal remains on the bus however until turned off by an appropri- 
ate response from the processor either given by the program (CONO, 
DATAO, or DATAI, depending on the device), or generated automatically
by the hardware. Thus if a request is not recognized or accepted when 
made, it will be when the necessary conditions are satisfied. A single level 
will even shut out all others of lower priority if every time its service 
routine dismisses the interrupt, a device assigned to it is already waiting 
with another request. 



1 On the Ten side of the DIA20, the interrupt works as described here. But on the other side
it acts more like the KI10 interrupt, with seven programmable levels, second-order
priority determined by proximity to the DIA20, etc. Of course the processor activities and
interrupt functions available are those of the KL10.

2 As explained in §2.18, the program treats all E bus controllers, internal subsystems, and
IO bus peripherals as IO devices. In other words, it monitors and controls them by means
of IO instructions using appropriate device codes. For a PDP-11, the device is the DTE20.

3 Remember that level 0 is always active, even when the interrupt system is off. In other
respects this discussion applies to all levels.
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The request signal is generally derived from a flag that is set by vari- 
ous conditions in the device. Often associated with these flags are enabling 
flags, where the setting of some device condition flag can request an inter- 
rupt on the assigned level only if the associated enabling flag is also set. 
The enabling flags are in turn controlled by the conditions supplied to the 
device by a CONO. For example, a device may have half a dozen flags to 
indicate various internal conditions that may require service by an inter- 
rupt; by setting up the associated enabling flags, the program can deter- 
mine which conditions shall actually request interrupts in any given cir- 
cumstances. 

Processing a Request. The processor handles only one request at a 
time. When it is ready, it accepts the highest priority request currently 
recognized, provided that request is on a level higher than the current 
program (all levels are higher than a noninterrupt program). To process a 
request the hardware sends an interrupt service demand to the devices on 
the E bus to determine which ones are currently requesting an interrupt on 
the accepted level. Note that at this point the processor is accepting not an 
individual request, but rather a class of requests: namely all those being 
made on the same level. Should the bus be busy, the demand is sent as soon 
as it becomes available, taking precedence over any IO instruction that 
may also be waiting (note that in this situation the program actually stops). 
From among the devices that respond to the demand on the accepted level, 
the processor selects the one of highest priority 4 according to this schedule: 

                                                     Physical 
Devices in Order of Decreasing Priority           Device Numbers 5 

Interval counter 
Other internal requests — processor error 
   flags, program initiated request 
Channels 0-7                                          0 - 7 
DTE20s 0-3                                           10 - 13 
DIA20 — i.e. any device on the IO bus                   17 



4 There are therefore two orders of priority associated with an interrupt: first the level, and 
then for all devices requesting interrupts simultaneously on the same level, physical device 
number. These physical numbers are not the device codes used in the IO instructions; 
they are just for interrupt priority purposes and depend on position on the backplane (the 
RH20s are ordered opposite from the slot numbers). 

5 Physical numbers 14-16 are not used. 

If the device selected is internal, no further processing of the request is 
required. Otherwise the hardware sends a function demand to the selected 
device (by specifying its physical number along with the interrupt level), 
and the device responds by returning an interrupt function word. In either 
case, once all necessary information about the request has been gathered, 
the interrupt system waits for the interrupt to start. The microcode checks 
frequently for a waiting request, and upon discovering one departs from its 
normal routine to start an interrupt. At such time PC points to the inter- 
rupted instruction, so a correct return can later be made to the interrupted 
program. 
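
The two orders of priority described above (first the level, then the physical device number among devices requesting on the same level) can be sketched in Python. This is a minimal illustration, not the hardware algorithm: the function name, the tuple layout, and the negative rank values given to the internal requests are assumptions made for the example, and it takes level 1 to be the highest priority level.

```python
# Sketch only: rank values for internal requests are hypothetical, chosen so
# that they sort ahead of every physical device number in the schedule.
INTERVAL_COUNTER = -2   # assumed rank: first in the priority schedule
OTHER_INTERNAL = -1     # assumed rank: processor error flags, program request


def select_request(requests, current_level=None):
    """Pick the request to service.

    requests: iterable of (level, physical_number) pairs, level 1-7,
              where a numerically lower level has higher priority.
    current_level: level of the interrupt routine now running, or None
              for a noninterrupt program (which every level outranks).
    Returns the accepted (level, physical_number) pair, or None.
    """
    eligible = [r for r in requests
                if current_level is None or r[0] < current_level]
    if not eligible:
        return None
    # First order: level. Second order: physical device number.
    return min(eligible, key=lambda r: (r[0], r[1]))
```

For example, with a DTE20 (physical number 10 octal) and the DIA20 (17 octal) both requesting on level 3, the DTE20 is chosen; a request on level 3 is not accepted at all while an interrupt is being held on level 3.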

Interrupt Functions and Instructions 

The action taken by the microcode to start an interrupt depends upon the 
function specified by the function word returned to the processor. Two fixed 
locations in the executive process table are associated with each level, loca- 
tions 40 + 2N and 41 + 2N, where N is the level number. Level 1 uses 
locations 42 and 43, level 2 uses 44 and 45, and so on to level 7 which uses 
56 and 57. The processor starts a "standard" interrupt for level N by executing 
the instruction in the first interrupt location for the level, i.e. location 
40 + 2N. This type of interrupt is performed for a processor error or 
program-initiated request, for an external device whose function word specifies 
a standard interrupt, and also for an IO bus device that returns no 
function word. The fixed locations however need not be used. The interrupt 
function word sent by the device may specify an equivalent interrupt using 
a pair of locations selected by the function word, or some other interrupt 
function entirely. The function word has this format. 



 ADDRESS 
  SPACE    FUNCTION    Q     DEVICE     00      INTERRUPT ADDRESS 

 0     2   3      5    6    7     10   11 12   13              35 



The microcode acts from a function word whether there is one or not; its 
absence is taken as a zero function. The DIA20 returns the word supplied 
over the IO bus or simulates a zero word. Bits 7-10 identify the device by 
its physical number, but this is supplied by the interrupt hardware, not the 
device. The meanings of the other bits in the word are as follows. 

0-2    In unrestricted examine and deposit functions, codes given in 
       these bits select the space in which the address supplied in bits 
       13-35 is interpreted. 

       0  Executive process table 
       1  Executive virtual address space 
       4  Physical address space 

       Remaining codes are reserved. 

3-6 Interrupt function (bits 3-5), sometimes qualified by Q (bit 6). 

When unspecified, Q is irrelevant. The microcode handles func- 
tions 4-6 even when it is in the halt loop. 

0 Internal device or zero word: for the interval counter perform a 
vector interrupt (see function 2); otherwise perform a standard 
interrupt (see function 1). 

1 Standard interrupt — execute the instruction in location 
40 + 2N of the executive process table. 

2 Vector interrupt — action depends on device type as follows: 

Interval counter — execute the instruction in location 514 
of the executive process table. 

DTE20 — execute the instruction in location 2 of the corre- 
sponding DTE20 control block. 6 

Channel — execute the instruction in the executive process 
table location specified by bits 27-35. 
DIA20 — dispatch interrupt: execute the instruction in the 
executive virtual location specified by bits 13-35. 

3 Increment — depending on whether Q is 0 or 1, add 1 to or 
subtract 1 from the contents of the executive virtual location 
specified by bits 13-35. 

4 Examine — send the contents of the specified location to the 
selected DTE20. If Q is 0, select the location according to bits 
0-2 and 13-35. If Q is 1, use bits 14-35 as a physical address 
and restrict the function to the communication area defined in 
the DTE20 control block. 6 The examine is effected by perform- 
ing a DATAO to the DTE20. 

5 Deposit — load the word supplied by the selected DTE20 into 
the specified location. If Q is 0, select the location according to 
bits 0-2 and 13-35. If Q is 1, use bits 14-35 as a physical 
address and restrict the function to the communication area 
defined in the DTE20 control block. 6 The deposit is effected by 
performing a DATAI to the DTE20. 

6 Byte transfer — increment the byte pointer for the direction 
specified by Q (0 out, 1 in) from the control block for the se- 
lected DTE20, and then move a byte between Ten memory and 
the DTE20 according to the altered pointer. 6 

7 Reserved (produces a standard interrupt at present). 

CAUTION 

Because of the special cycle in which it is executed, an interrupt 
function that uses virtual addressing cannot employ indirect 
pointers in its paging procedure (§3.4). 



6 For further information on front end interrupt functions, refer to §3.7. 

13-35 The bits among these that supply the address when the function 
requires one depend on the address space. 

Executive process table 27-35 

Executive extended virtual address space 13-35 

Executive unextended virtual address space 18-35 

Physical address space 14-35 

Regardless of what mode the processor is in when an interrupt occurs, 
the interrupt operation is carried out in 
executive virtual address space unless the particular function selects some 
other form of addressing. A page failure that occurs in an interrupt opera- 
tion is never trapped; instead it sets the In-out Page Failure flag, which 
requests an interrupt on the level assigned to the processor (§3.8). These 
considerations of course do not apply to a service routine called by an inter- 
rupt instruction. 
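
As an illustration of the function word layout and the fixed interrupt locations described above, the following Python sketch extracts each field from a 36-bit word. It is only a sketch: the helper names and the dictionary layout are invented for the example, and bit 0 is taken as the most significant bit of the word, as in the bit numbering used throughout this section.

```python
FUNCTION_NAMES = {
    0: "internal device or zero word",
    1: "standard interrupt",
    2: "vector interrupt",
    3: "increment",
    4: "examine",
    5: "deposit",
    6: "byte transfer",
    7: "reserved",
}


def field(word, first, last):
    """Return bits first..last of a 36-bit word (bit 0 is the leftmost bit)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def decode_function_word(word):
    """Split an interrupt function word into its fields."""
    return {
        "space": field(word, 0, 2),          # address space code
        "function": FUNCTION_NAMES[field(word, 3, 5)],
        "q": field(word, 6, 6),              # qualifier bit
        "device": field(word, 7, 10),        # physical device number
        "address": field(word, 13, 35),      # interrupt address
    }


def standard_locations(level):
    """Fixed executive process table locations 40 + 2N and 41 + 2N (octal)."""
    return 0o40 + 2 * level, 0o41 + 2 * level
```

Here standard_locations(1) gives locations 42 and 43 octal, and standard_locations(7) gives 56 and 57, matching the text.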

Interrupt Instructions. An instruction executed in response to an 
interrupt request and not under control of PC is referred to elsewhere in 
this manual as being "executed as an interrupt instruction." Some instruc- 
tions, when so executed, have different effects than they do when performed 
in other circumstances. And the difference is not due merely to being per- 
formed in an interrupt location or in response (by the program) to an inter- 
rupt. To be an interrupt instruction, an instruction must be executed in the 
first or second interrupt location for a level, in direct response by the hard- 
ware (rather than by the program) to a request on that level. These loca- 
tions may be the fixed ones for a standard interrupt or those given by the 
function word for a vector interrupt. §2.17 describes the two ways a BLKO 
is performed. If a BLKO is contained in an interrupt routine called by a 
JSR, it is not "executed as an interrupt instruction" even in the unlikely 
event the routine is stored within the interrupt locations and the BLKO is 
executed by an XCT. There are two types of interrupt instructions executed 
in a standard or dispatch interrupt; the effects of all other instructions are 
undefined. 

BLKI, BLKO. If the pointer count is not zero, the processor dismisses 
the interrupt and returns immediately to the interrupted program (i.e. 
it returns control to the unchanged PC). If the count is zero, the proces- 
sor executes the instruction contained in the second interrupt location. 

XPCW, JSR. The processor holds an interrupt on the level, takes the 
next instruction from the location specified by the jump (as indicated by 
the newly changed PC), and enters either kernel mode or the mode 
specified by the new flag word of the XPCW. Hence the instruction is 
usually a jump to a service routine handled by the Monitor. XPCW is 
the preferred instruction on the extended KL10. 

The most important point of which the programmer must be aware is 
that even while User is set, the interrupt instructions are not part of the 
user program. They are executed in kernel mode and are therefore subject 
only to kernel mode restrictions. Regardless of the current PC section, the 
address part of an interrupt instruction is interpreted as referencing section 0, 
except in a dispatch interrupt, where it references the section specified 
by the interrupt function word. As an interrupt instruction, JSR automatically 
clears both User and Public to jump to a kernel mode service 
routine. An XPCW should be set up to produce the same result. The XPCW 
control block must be in section 0 unless the interrupt is a dispatch. 

CAUTION 

Because of the special cycle in which an interrupt instruction 
is executed, the paging procedure for it cannot employ indi- 
rect pointers (§3.4). 

Interrupt Programming 

The program can control the priority interrupt system by means of condition 
IO instructions. The device code is 004, mnemonic PI. 



CONO PI,          Conditions Out, Priority Interrupt 

     70060      I     X              Y 
 0          12  13  14  17  18              35 

Perform the functions specified by the effective conditions E as shown 8 (a 1 
bit produces the indicated function, a 0 has no effect). 



Bits    Function 

18-20   WRITE EVEN PARITY: ADDRESS (18), DATA (19), DIRECTORY (20) 
22      DROP PROGRAM REQUESTS ON SELECTED LEVELS 
23      CLEAR PI SYSTEM 
24      INITIATE INTERRUPTS ON SELECTED LEVELS 
25      TURN ON SELECTED LEVELS 
26      TURN OFF SELECTED LEVELS 
27      TURN OFF PI SYSTEM 
28      TURN ON PI SYSTEM 
29-35   SELECT LEVELS FOR BITS 22, 24, 25, 26 (LEVELS 1-7) 

22 On levels selected by 1s in bits 29-35, turn off any interrupt requests 
made previously by the program (via bit 24). 

23 Turn off the priority interrupt system, turn off all levels, drop all 
program-set requests, and dismiss all interrupts that are currently 
being held. 

24 Request interrupts on levels selected by 1s in bits 29-35 and force the 
processor to recognize them even on levels that are off. The request 
remains indefinitely, so as soon as an interrupt is completed on a 
given level another is started, until the request is turned off by a 
CONO that selects the same channel and has a 1 in bit 22. 






7 Data instructions with device code PI are unassigned and execute as MUUOs. The block 
instructions are used for error and diagnostic purposes (§3.8). 

8 Bits 18-20 are for test purposes only. They are used to force errors and are discussed in 
§3.8. 

Remember that the processor allows the program to continue 
while it processes a request. Thus when this bit forces recognition of a 
request, many additional program instructions may be performed before 
the interrupt, even on the highest priority level. Moreover if the 
request is allowed to remain, additional instructions may be performed 
between successive interrupts. For other than the highest priority 
level, the greater the number of higher levels active, the greater 
the amount of program time available both initially and between successive 
interrupts. If the program forces an interrupt on the lowest 
level when all are active, there can be a very long time between 
CONO PI, and its interrupt. 

25 Turn on the levels selected by 1s in bits 29-35 so interrupt requests 
can be recognized on them. 

26 Turn off the levels selected by 1s in bits 29-35, so interrupt requests cannot be 
recognized on them unless made by a CONO PI, with a 1 in bit 24. 

27 Turn off the interrupt system so no requests can be recognized. 

28 Turn on the interrupt system so the hardware can process requests. 
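
To make the bit assignments concrete, the following Python sketch assembles the effective conditions E for a CONO PI, from the functions listed above. The constant and function names are invented for the illustration; only the bit positions come from the text. Bit 35 is the rightmost bit of the word, and levels 1-7 occupy bits 29-35.

```python
def bit(n):
    """Mask for bit n of a 36-bit word (bit 35 is the least significant)."""
    return 1 << (35 - n)


# Hypothetical names for the function bits described in the text.
DROP_REQUESTS = bit(22)
CLEAR_PI = bit(23)
INITIATE = bit(24)
LEVELS_ON = bit(25)
LEVELS_OFF = bit(26)
SYSTEM_OFF = bit(27)
SYSTEM_ON = bit(28)


def level_mask(levels):
    """Return the mask selecting the given levels (1-7) in bits 29-35."""
    mask = 0
    for level in levels:
        if not 1 <= level <= 7:
            raise ValueError("level must be in the range 1-7")
        mask |= bit(28 + level)
    return mask


def cono_pi(functions, levels=()):
    """Combine function bits and selected levels into the conditions E."""
    return functions | level_mask(levels)
```

Under these definitions, turning on the system and all seven levels is cono_pi(SYSTEM_ON | LEVELS_ON, range(1, 8)), which is 2377 octal, and requesting an interrupt on level 7 alone is cono_pi(INITIATE, [7]), which is 4001 octal.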



CONI PI,          Conditions In, Priority Interrupt 

     70064      I     X              Y 
 0          12  13  14  17  18              35 



Read the status of the priority interrupt (and several diagnostic bits) into 
location E as shown. 



Bits    Meaning 

11-17   PROGRAM REQUESTS ON LEVELS 1-7 
18-20   WRITE EVEN PARITY: ADDRESS (18), DATA (19), DIRECTORY (20) 
21-27   INTERRUPT IN PROGRESS ON LEVELS 1-7 
28      PI SYSTEM ON 
29-35   LEVELS ON 1-7 



Levels that are on are indicated by 1s in bits 29-35; 1s in bits 21-27 indicate 
levels on which interrupts are currently being held; and 1s in bits 
11-17 indicate levels that are receiving interrupt requests generated by a 
CONO PI, with a 1 in bit 24. A 1 in bit 28 means the interrupt system is on, 
and 1s in bits 29-35 therefore indicate active levels. 
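
The status word read by CONI PI, can be unpacked as in the Python sketch below. The function names and the dictionary keys are assumptions made for the illustration; the bit positions are those given in the text, with level 1 in the leftmost bit of each seven-bit field.

```python
def levels_from_bits(word, first_bit):
    """Return the levels (1-7) flagged in the 7-bit field that starts at first_bit."""
    return [level for level in range(1, 8)
            if word & (1 << (35 - (first_bit + level - 1)))]


def decode_coni_pi(word):
    """Interpret the priority interrupt status read by CONI PI,."""
    system_on = bool(word & (1 << (35 - 28)))
    levels_on = levels_from_bits(word, 29)
    return {
        "program_requests": levels_from_bits(word, 11),  # set via CONO bit 24
        "held": levels_from_bits(word, 21),              # interrupts in progress
        "system_on": system_on,
        "levels_on": levels_on,
        # A level is active only when both it and the system are on.
        "active": levels_on if system_on else [],
    }
```

For the status word 1010340 octal this reports a program request on level 7, an interrupt held on level 3, the system on, and levels 1 and 2 on and therefore active.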

The remaining conditions read by this instruction have nothing to do 
with the interrupt. Bits 18-20 reflect several diagnostic functions discussed 
in §3.8: 



Dismissing an Interrupt. Unless the interrupt operation dismisses 
the interrupt automatically, the processor holds an interrupt until the program 
dismisses it, even if the interrupt routine is itself interrupted by a 
higher priority level. Thus interrupts can be held on a number of levels 
simultaneously, but from the time an interrupt is started until it is dismissed, 
no interrupt request can be accepted on that level or any of lower 
priority. 

A routine dismisses the interrupt by using an instruction that restores 
the level on which the interrupt is being held at the same time it returns to 
the interrupted program. The proper instruction is XJEN (JRST 7,) in an 
extended KL10, otherwise JEN (JRST 12,). Once the level is restored, the 
processor can again accept requests and start interrupts on it and lower 
priority levels. These instructions also restore the flags: XJEN from the 
flag-PC doubleword if the routine was called by an XPCW; JEN from the 
left half of the PC word if the routine was called by a JSR in section 0. 
XJEN also restores the previous context section if the return is being made 
to an executive program. 

CAUTION 

An interrupt routine must dismiss the interrupt when it re- 
turns to the interrupted program, or its level and all levels of 
lower priority will be disabled, and the processor will treat 
the new program as a continuation of the interrupt routine. 

Timing. The maximum time a device may wait for an interrupt to 
start depends on how many active devices are of higher priority and how 
long their service routines are. When a given request is of highest priority, 
its device need never wait longer than 10 μs. 

Special Considerations. When an interrupt occurs, PC points to the 
interrupted instruction (or to an XCT that executed it), unless the interrupt 
occurred in an overflow trap instruction, in which case PC points to the 
instruction that overflowed. After taking care of the interrupt, the proces- 
sor can always return to the interrupted instruction. Either a) the instruc- 
tion did not change anything; b) the interrupt was in the second part of a 
two-part instruction, where First Part Done being set prevents the proces- 
sor from repeating any unwanted operations in the first part; or c) the 
interrupt occurred at some point in a multipart instruction where the mi- 
crocode rigged the various pointers and other quantities so the processor 
actually restarts the instruction where it stopped, rather than from the 
beginning. However, in a BLT and in byte manipulation, the very mecha- 
nism that facilitates the return results in special properties of which the 
programmer must be aware. 

An interrupt can start following any transfer in a BLT. When one does, 
the BLT puts the pointer (which has counted off the number of transfers 
already made) back in AC. Then when the instruction is restarted following 
the interrupt, it actually starts with the next transfer. This means that if 
interrupts are in use, the programmer cannot use the accumulator that 
holds the pointer as an index register in the same BLT, he cannot have the 
BLT load AC except by the final transfer, and he cannot expect AC to be 
the same after the instruction as it was before. 
An interrupt can also start in the second effective address calculation 
in a two-part byte instruction. When this happens, First Part Done is set. 
This flag is saved as bit 4 of a flag word, and if it is restored by the interrupt 
routine when the interrupt is dismissed, it prevents a restarted ILDB 
or IDPB from incrementing the pointer a second time. This means that the 
interrupt routine must check the flag before using the same pointer, as it 
now points to the next byte. Giving an ILDB or IDPB would skip a byte. 
And if the routine restored the flag, the interrupted ILDB or IDPB would 
process the same byte the routine did. 
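
The restart rule for an interrupted BLT described earlier in this section (the pointer is put back in AC, and the restarted instruction begins with the next transfer) can be modeled in a few lines of Python. This is a simplified sketch: the function name and its arguments are invented for the illustration, memory is a plain list, and the value left in AC after completion is whatever the model happens to produce, not a statement about the hardware.

```python
def blt(memory, ac, end, interrupt_after=None):
    """Model a block transfer that may be interrupted.

    ac: (source, destination) pointer pair, as held in the accumulator.
    end: last destination address to be written.
    interrupt_after: number of transfers after which an interrupt starts,
        or None for an uninterrupted run.
    Returns (ac, finished); when finished is False the returned ac is the
    pointer that would be put back in AC for the restart.
    """
    src, dst = ac
    transfers = 0
    while True:
        memory[dst] = memory[src]
        transfers += 1
        src, dst = src + 1, dst + 1
        if dst > end:
            return (src, dst), True
        if transfers == interrupt_after:
            return (src, dst), False


# Copy four words, with an interrupt after the second transfer.
memory = [1, 2, 3, 4, 0, 0, 0, 0]
ac, finished = blt(memory, (0, 4), 7, interrupt_after=2)
ac, finished = blt(memory, ac, 7)   # restart picks up with the third word
```

The result of the two-step run is the same as an uninterrupted transfer, but the accumulator no longer holds its original contents, which is why the program cannot rely on AC being unchanged.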

Programming Suggestions. The Monitor handles all interrupts for 
user programs. Even if the User In-out flag is set, a user generally cannot 
reference the interrupt locations to set them up. Procedures for informing 
the Monitor of the interrupt requirements of a user program are discussed 
in the Monitor manual. 

For those who do program priority interrupt routines, there are several 
rules to remember. 

• Use interrupt instructions in a manner consistent with the special ef- 
fects and conditions applicable to such instructions as described above. 

• No request can be accepted, not even on higher priority levels, while a 
request is being processed or an interrupt is starting. Therefore do not use 
lengthy effective address calculations in interrupt instructions. 

• To prevent a device from hanging up a level, the programmer must be 
aware of — and satisfy — whatever requirements the device has for drop- 
ping the request. 

• The interrupt instruction that calls the routine should be an XPCW on 
an extended KL10, otherwise a JSR. In either case the paging for the in- 
struction must not use indirect page pointers. 

• The principal function of an interrupt routine is to respond to the situa- 
tion that caused the interrupt. Computations and any other time- 
consuming activities that can possibly be performed outside the routine 
should not be included within it. 

• Never turn off the interrupt system in a routine unless it is absolutely 
necessary, and then always turn it back on again as soon as possible. If one 
or more levels can be turned off in place of the entire system, always do 
that instead. 

• If the routine uses a UUO it must first save the contents of the loca- 
tions that will be changed by it in case the interrupted program was in the 
process of handling a UUO of the same type (§2.16). 

• The routine must dismiss the interrupt (with an XJEN or JEN) when 
returning to the interrupted program. Flags and UUO locations should be 
restored. 



3.2 Cache Management 

For the user, the cache is transparent: any program simply gets informa- 
tion from memory and stores information in memory. But use of a cache as 
part of the memory subsystem reduces program time, since the cache is 
faster than the storage modules, and also reduces storage use by the program, 
making a larger percentage of total storage cycles available to other 
parts of the system. As explained in §1.7, transfers between processor and 
memory are in four-word groups: storage references are to four locations at 
a time. 9 The cache contains representations of a selection of such location 
groups. One may view the cache as 2048 general purpose registers, organ- 
ized in sets of four, which substitute temporarily for the most frequently 
referenced physical storage location groups. The cache serves this function 
not only for the program, but for all microcode references, including those 
for handling interrupts, traps, page refills, and other automatic operations. 

The way the hardware handles the cache depends upon whether the initial 
processor reference to a location in a particular group is read or write. 

When the first processor reference to a group is to read the contents of 
one of its locations, memory control retrieves the entire four-word group 
containing the referenced location. The single word requested is supplied to 
the program, but all four are placed in the cache and are validated, i.e. they 
are tagged as words that do represent the true contents of memory. Subse- 
quent references, read or write, to the same group are made to the cache, 
not to storage. If the processor modifies the contents of a location in the 
group, the new word supplied is substituted for the one in the cache loca- 
tion, which is tagged as written. Thus the cache word is different from 
storage but still valid — i.e. it represents what the storage location should 
contain. 

When the first reference to a group is for writing, there is no call to 
storage at all. Instead the hardware sets aside a location group in the 
cache, with the one word in it tagged as both valid and written. Further 
reads or writes of the same location are handled solely with the cache, and 
subsequent writes to other locations in the same group are handled just like 
the first. But a read to a location that has not been written produces a 
storage reference. The requested word is given to the processor, and all 
words in the group that do not already have written representations in the 
cache are inserted into the group entry. 
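
The read-first and write-first cases just described can be modeled with a small Python class that keeps a valid bit and a written bit for each of the four words in a group entry. The class and method names are invented for the illustration, and the model ignores parity, timing and the page number; it is a sketch of the tagging rules only.

```python
class GroupEntry:
    """One four-word group entry with per-word valid and written bits."""

    def __init__(self):
        self.words = [0] * 4
        self.valid = [False] * 4
        self.written = [False] * 4

    def read(self, index, storage_group):
        """Read one word; storage_group is the four-word group in storage."""
        if not self.valid[index]:
            # Storage reference: bring in every word that does not already
            # have a written representation in the cache.
            for i in range(4):
                if not self.written[i]:
                    self.words[i] = storage_group[i]
                    self.valid[i] = True
        return self.words[index]

    def write(self, index, value):
        """Write one word; no storage reference is made."""
        self.words[index] = value
        self.valid[index] = True
        self.written[index] = True
```

If the first reference to a group is a write to word 1, only that word is tagged valid and written. A later read of word 0 then goes to storage and fills words 0, 2 and 3, leaving the written word 1 untouched.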

When storage is being updated or a group entry that is not in use is 
replaced by another, words just valid can be thrown away. But written 
words must eventually be sent to a storage module. 

Cache Structure. The 2048 locations in the cache are contained in 128 
lines of sixteen each. The lines are identified by the possible group numbers 
in a single page, 0-177. Each line contains four group entries for the given 
number. Each group entry in turn comprises the number of the physical 
page 10 containing the storage group corresponding to the entry and repre- 
sentations of the four locations in the group, each with valid, written and 
parity bits. 
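
The arithmetic behind this organization can be checked with a short Python sketch. It assumes a page of 512 words (128 groups of four, which is what group numbers 0-177 octal imply) and assumes that the low two address bits select the word within a group; the function name is invented for the illustration.

```python
WORDS_PER_GROUP = 4
GROUPS_PER_PAGE = 128      # group numbers 0-177 octal
ENTRIES_PER_LINE = 4


def split_address(physical_address):
    """Return (page number, line number, word within group) for an address."""
    word = physical_address & 0o3
    line = (physical_address >> 2) & 0o177   # group number within the page
    page = physical_address >> 9             # assumes 512-word pages
    return page, line, word


# 128 lines of four group entries of four words each: 2048 locations.
CACHE_LOCATIONS = GROUPS_PER_PAGE * ENTRIES_PER_LINE * WORDS_PER_GROUP
```

Under these assumptions, address 1234567 octal lies in page 1234, maps to line 135, and is word 3 of its group.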



9 Of course memory control does not blindly request four storage cycles for every group even 
when it is known that some are unnecessary. Fewer references are made when some 
locations in a group already have valid representations in the cache, or the first or last 
transfer in a channel block is for part of a group. 

10 The list of all page numbers makes up the cache "directory." For many hardware functions 
the cache is organized in four quadrants. A quadrant contains 128 group entries, 
one from each line. 

The hardware also includes a mechanism for keeping track of the use 
of the various group entries. Whenever the processor references a group 
whose corresponding line in the cache already contains valid entries from 
four other pages, the hardware puts the new group representation in place 
of the least recently used entry in the line. But in doing so it also updates 
storage from any representations tagged as written in the displaced group entry. 
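
The replacement rule for a full line can be sketched in Python as follows. The text does not describe how the hardware use tables are implemented, so the ordered dictionary used here is purely an assumption of the sketch, as are the class and method names; a single written flag per entry stands in for the per-word written bits.

```python
from collections import OrderedDict


class CacheLine:
    """One cache line holding up to four group entries, keyed by page number."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # page -> written flag, oldest first

    def reference(self, page, write=False):
        """Reference the group for a page.

        Returns the page number of a displaced entry whose written words
        had to be sent to storage, or None if no write-back was needed.
        """
        written_back = None
        if page in self.entries:
            self.entries.move_to_end(page)       # now the most recently used
        else:
            if len(self.entries) == self.capacity:
                victim, written = self.entries.popitem(last=False)
                if written:
                    written_back = victim        # update storage first
            self.entries[page] = False
        if write:
            self.entries[page] = True
        return written_back
```

With pages 1 through 4 in a line and page 1 the least recently used and written, a reference to page 5 displaces page 1 and reports that its written words go to storage.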

Internal Channels. The channels are expected in general to deal with 
the storage modules, but if the cache contains any valid words for a page 
being handled through the channels, the hardware acts as follows: 

In an output operation, any valid representations at locations addressed 
by a channel are taken from the cache instead of storage. 

In an input operation, all data is sent to storage. However any entries 
that are in the cache for locations addressed by the channel are invali- 
dated. 

The reasons for this behavior are apparent. For output any valid words left 
in the cache might as well be taken since that is faster than going to 
storage. Furthermore some valid entries may have been written, and it is 
assumed that storage will certainly not be more up to date than the cache. 
Anything brought in via a channel is assumed to be the correct copy, and it 
should therefore go to storage as the page cannot be in use at the same time 
it is being loaded. Any valid entries left over in the cache must be from 
some previous operation, and they should therefore be invalidated, so any 
future references to those locations will go to storage for the correct copy. 
Should any of the valid leftovers be tagged as written, it is assumed the 
Monitor would have swapped out the modified page before bringing in the 
new. Of course a page used as temporary storage, or to hold counters and 
control words, albeit modified, can just be thrown away. 



Cache Programming 

The operations the program can perform on or for the cache are three: to 
invalidate, to validate, and to unload. Any of these operations may be car- 
ried out for all entries in the cache or for all entries of a single page. To 
invalidate a location is simply to clear its valid and written bits so it no 
longer represents anything. To validate or unload means to update storage, 
i.e. to write a cached word into storage if it is tagged as written, and to 
clear the written bit. Otherwise validating storage leaves the validity of the 
cache entries unchanged, whereas unloading invalidates all entries, writ- 
ten or not, in the groups being processed (all those in a single page or the 
entire cache). 
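
The difference between the three operations can be summarized in a Python sketch that applies one of them to a list of cached words. The data layout (a dictionary per cached word, and a dictionary standing in for storage) and the function name are assumptions of the sketch, not a description of the sweeper hardware.

```python
def sweep(entries, storage, operation, page=None):
    """Apply 'invalidate', 'validate' or 'unload' to cached words.

    entries: list of dicts with keys page, address, value, valid, written.
    storage: dict mapping address to the word held in storage.
    page: restrict the operation to one physical page, or None for all.
    """
    if operation not in ("invalidate", "validate", "unload"):
        raise ValueError("unknown operation")
    for entry in entries:
        if page is not None and entry["page"] != page:
            continue
        # Validate and unload both update storage from written words.
        if operation != "invalidate" and entry["valid"] and entry["written"]:
            storage[entry["address"]] = entry["value"]
            entry["written"] = False
        # Invalidate and unload both leave the entry representing nothing.
        if operation != "validate":
            entry["valid"] = False
            entry["written"] = False
```

Validating leaves an entry usable by the processor with storage brought up to date; unloading brings storage up to date and then discards the entry; invalidating discards the entry without touching storage.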

Following power turnon in any system, the cache use tables must be 
initialized and the cache invalidated, as its initial state is indeterminate. 
Beyond this, a system with a single central processor and internal channels 
requires no cache programming, as everything is handled adequately by 
the hardware. However if a system contains facilities that bypass the proc- 
essor to deal directly with external memory, whether such facility be an 
external channel or another central processor, then the Monitor must actually 
manage the relationship between storage modules and cache. 

As an example of such management and to illustrate the difference in 
use between validation and unloading, consider the situation in which a 
program is through with the data in a particular (modified) page and it is to 
be swapped via an external channel with new data brought into the same 
physical page for later use. The page must be unloaded into storage so that 
subsequently the program will go there for the new data. On the other hand 
suppose a program has created some code in a page, and the system is both 
to go ahead and execute it immediately and place it in a library. Now 
validation is the proper procedure: while the storage copy is being filed, the 
program can continue execution from the cache. 

For initialization and management, there is one instruction that initializes 
the use tables and six that sweep the cache to perform the above 
three operations for a single page or all pages. Note that a sweep of the 
entire cache is always necessary, even for handling a single page, as there 
is no prior way of knowing whether any given line contains a group from 
any given page. Sweeping for a single page does however take less time 
than sweeping for all pages. In the latter case the sweeper must check all 
512 group entries, whereas the former requires checking only every line to 
see if it contains an entry for the specified page, and there can be at most 
one such entry. Moreover sweeping for all pages can usually be expected to 
require more storage references than sweeping for a single page. In this 
light it should be noted that the sweep instructions simply initiate opera- 
tions which are then carried forward by the cache sweeper. The program 
can continue while the sweep is going on, but this can be expected to slow 
down the sweep as the cache and program would then compete for storage 
references. That a sweep is in progress is indicated by the Sweep Busy flag 
being on, and at completion the sweeper clears Busy and sets Sweep Done. 
The program can check both of these flags among what are otherwise the 
processor error conditions, and it can enable the latter to request an inter- 
rupt on the level assigned to the processor (§3.8). 

These are IO instructions wherein the cache sweeper has device code 
014, mnemonic CCA. But the instructions have their own mnemonics since 
they bear no relation to the standard IO operations. Six of the eight are 
used: the BLKI and CONO also sweep, doing nothing but wasting cache 
cycle time. The single instruction that initializes the use tables is discussed 
at the end of the section. 
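
The five-digit octal codes shown with each sweep instruction follow from the general IO instruction layout: 7 in bits 0-2, the device code in bits 3-9, and the function in bits 10-12. The Python sketch below reproduces them from device code 014. The function ordering is the standard one for IO instructions; the names io_opcode and SWEEP_MNEMONICS are invented for the illustration.

```python
# IO function codes 0-7, in order.
IO_FUNCTIONS = ["BLKI", "DATAI", "BLKO", "DATAO",
                "CONO", "CONI", "CONSZ", "CONSO"]

CCA = 0o014   # cache sweeper device code
PI = 0o004    # priority interrupt device code

# Sweep mnemonics and the IO operations they stand for, as listed in the text.
SWEEP_MNEMONICS = {
    "SWPIA": "DATAI", "SWPVA": "BLKO", "SWPUA": "DATAO",
    "SWPIO": "CONI", "SWPVO": "CONSZ", "SWPUO": "CONSO",
}


def io_opcode(device_code, function):
    """Return bits 0-14 of an IO instruction, with I (bit 13) and bit 14 zero."""
    if device_code % 4 or not 0 <= device_code <= 0o774:
        raise ValueError("device code must be a multiple of 4 in 000-774")
    return 0o70000 | (device_code << 3) | (IO_FUNCTIONS.index(function) << 2)


codes = {name: io_opcode(CCA, op) for name, op in SWEEP_MNEMONICS.items()}
```

The resulting table gives 70144 for SWPIA and 70174 for SWPUO, and the same function yields 70060 for CONO PI, and 70064 for CONI PI,.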



SWPIA     Sweep Cache, Invalidate All Pages     (DATAI CCA,) 
                                                E is not used. 11 

     70144      I     X              Y 
 0          12  13  14  17  18              35 

Set Sweep Busy, and clear the valid and written bits in all cache entries. At 
the completion of the sweep, clear Sweep Busy and set Sweep Done, requesting 
an interrupt on the level assigned to the processor. 

11 I, X and Y are reserved and should be zero. 

SWPIO     Sweep Cache, Invalidate One Page     (CONI CCA,) 

     70164      I     X              Y 
 0          12  13  14  17  18              35 



Set Sweep Busy, and clear the valid and written bits in all cache entries for 
the physical page specified by bits 23-35 of E. At the completion of the 
sweep, clear Sweep Busy and set Sweep Done, requesting an interrupt on 
the level assigned to the processor. 



SWPVA     Sweep Cache, Validate All Pages     (BLKO CCA,) 
                                              E is not used. 11 

     70150      I     X              Y 
 0          12  13  14  17  18              35 



Set Sweep Busy, and write into storage all cached words whose written bits 
are set. Clear all written bits but do not change the validity of any entries. 
At the completion of the sweep, clear Sweep Busy and set Sweep Done, 
requesting an interrupt on the level assigned to the processor. 



SWPVO     Sweep Cache, Validate One Page     (CONSZ CCA,) 

     70170      I     X              Y 
 0          12  13  14  17  18              35 



Set Sweep Busy, and write into storage all cached words whose written bits 
are set and which are found in entries for the physical page specified by bits 
23-35 of E. Clear the written bits associated with those words sent to stor- 
age, but do not change the validity of any entries. At the completion of the 
sweep, clear Sweep Busy and set Sweep Done, requesting an interrupt on 
the level assigned to the processor. 



SWPUA Sweep Cache, Unload All Pages (DATAO CCA,)

E is not used.

[Instruction format: bits 0-12 = 70154, bit 13 = I, bits 14-17 = X, bits 18-35 = Y]



Set Sweep Busy, and write into storage all cached words whose written bits 
are set. Invalidate the entire cache, i.e. clear all valid and written bits. At 
the completion of the sweep, clear Sweep Busy and set Sweep Done, re- 
questing an interrupt on the level assigned to the processor. 
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SWPUO Sweep Cache, Unload One Page (CONSO CCA,) 



[Instruction format: bits 0-12 = 70174, bit 13 = I, bits 14-17 = X, bits 18-35 = Y]



Set Sweep Busy, and write into storage all cached words whose written bits 
are set and which are found in entries for the physical page specified by bits 
23-35 of E. Invalidate all entries for the specified page, i.e. clear both their 
valid and written bits. At the completion of the sweep, clear Sweep Busy 

and set Sweep Done, requesting an interrupt on the level assigned to the
processor.



Management of the cache is relatively straightforward. With external 
channels the program must simply be sure always to update storage pages 
before having them sent out, and to invalidate the cache entries for pages 
being brought in so processor references will go to storage for the new data. 

The same procedures are used for a multiprocessor system, but here a 
problem arises when different processors are allowed to reference the same 
page at the same time, if either is allowed also to modify the page. Without 
modification the cache copies in both processors will remain valid; but if a 
processor modifies the page, the other cannot expect to get up-to-date data 
from cached words. To handle this situation, the pager includes mechanisms
for bypassing the cache. Each page mapping 12 contains a cache bit for
determining whether cache use is allowed for the given page. This cache bit
applies only to an individual page, and has no effect at all unless cache use
is enabled by the cache look bit. Analogous to the mapping cache bit is a
load bit that applies to all unpaged references (such as pager references to
the process tables). The look and load bits are among the conditions the
Monitor provides to the pager. The way these "cache strategy" conditions
govern cache use is as follows.



Look 

0 The cache is disabled: go to storage for all references.

1 Look in the cache for all references. This means always use the 
cache (reading or writing) for any locations that already have valid 
representations. Furthermore when there is no valid representa- 
tion for a reference, load the cache (reading or writing) if either the 
reference is unpaged and the load bit is 1, or the reference is paged 
and the cache bit in the mapping for the page is 1. 
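The look, load and cache bit rules above can be summarized as a small decision function. This is only a sketch of the rules as stated in the text, not of the hardware; the function name, the argument names and the three result strings are hypothetical.

```python
def cache_decision(look, load, paged, cache_bit, already_valid):
    """Decide how one memory reference is handled.

    look, load    -- cache strategy bits supplied by the Monitor
    paged         -- True for a paged reference, False for an unpaged one
    cache_bit     -- cache bit from the page mapping (ignored if unpaged)
    already_valid -- True if the cache holds a valid entry for the location
    """
    if not look:
        return "storage"        # cache disabled: every reference goes to storage
    if already_valid:
        return "use cache"      # always use a valid representation
    allowed = cache_bit if paged else load
    return "load cache" if allowed else "storage"


if __name__ == "__main__":
    # Paged reference, cacheable page, nothing cached yet.
    print(cache_decision(look=1, load=0, paged=True, cache_bit=1, already_valid=False))
```

Note that with look set, a location already in the cache is used even when the governing load or cache bit is 0; those bits control only whether new entries are loaded.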



12 For information on page mapping refer to §3.3 or §3.4 depending on whether the system 
uses respectively the TOPS-10 or TOPS-20 Monitor. Instructions for handling the pager 
are discussed in §3.5. 
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Timing. Simple invalidation takes little time, and it interferes mini- 
mally with the program since it requires no storage references. Otherwise 
an average sweep requires on the order of several hundred microseconds, 
but varies widely depending on the number of references required. Allow- 
ing the program to run simultaneously slows down the sweep because of 
competition for storage cycles, but program time is saved nonetheless. 

Initializing the Cache. The use logic contains two tables each with 
128 entries. Each entry in the use table identifies the use history — from 
most to least recently used — of the group entries in the corresponding 
cache line. With each reference, the use entry for the line must be updated. 
But instead of containing complex computational logic, the hardware has a 
refill table that supplies new use entries as a function of the previous use 
history of a given line and the group entry currently being accessed in the 
line. Following power up the program must initialize the use logic by giv- 
ing this instruction 128 times to load every 3-bit location in the refill table. 



WRFIL Write Refill Table (BLKO APR,)

[Instruction format: bits 0-12 = 70010, bit 13 = I, bits 14-17 = X, bits 18-35 = Y]



Load the refill data given by bits 18-20 of E into the refill table location 
specified by bits 27-33. 13 



[Format of E: refill table data in bits 18-20, refill table address in bits 27-33]



After filling the refill table by stepping through locations 0-177 (values
of E that are multiples of 4 from 0 to 774), the program should give an
SWPIA to invalidate the indeterminate initial contents of the cache. During
the sweep the normal monitoring of cache access by the use logic initializes
the use table from the refill table. The way the use table gets set up
depends on the data pattern — the "refill algorithm" — loaded into the 
refill table, and the pattern selected depends on the use strategy desired for 
the cache. To limit cache use to a single quadrant, simply load the quadrant 
number (0-3) into the entire refill table. The usual use strategy is to allow 
equal use of all quadrants and to start with a presumed use history of most 
to least recently used corresponding to the numerical order of the quad- 
rants. To implement this strategy, 14 load the following data pattern. 



13 The refill locations are selected by bits 27-33 to make use of the same lines that supply 
group numbers to address entries in the use table. 

14 For information on refill algorithms for other use strategies, refer to the writeup of
MAINDEC-10-DDQDA-L-D (SUBRTN).
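To make the WRFIL addressing concrete, the sketch below builds the 128 effective addresses for the single-quadrant strategy described above, in which the same quadrant number is loaded into every refill table location. It does not reproduce the data pattern for the usual equal-use strategy. The function names are hypothetical; the bit positions are those given for WRFIL (data in bits 18-20 of E, refill table location in bits 27-33).

```python
def wrfil_address(data, location):
    """Build the 18-bit effective address E for one WRFIL.

    data     -- 3-bit refill data, placed in bits 18-20 of E
    location -- refill table location 0-177 (octal), placed in bits 27-33
    """
    if not 0 <= data <= 0o7:
        raise ValueError("refill data must fit in 3 bits")
    if not 0 <= location <= 0o177:
        raise ValueError("refill table has 128 locations")
    # In an 18-bit right half, bit 20 has weight 2**15 and bit 33 has weight 2**2.
    return (data << 15) | (location << 2)


def single_quadrant_fill(quadrant):
    """Return the 128 values of E that load one quadrant number everywhere."""
    if not 0 <= quadrant <= 3:
        raise ValueError("quadrant must be 0-3")
    return [wrfil_address(quadrant, loc) for loc in range(128)]


if __name__ == "__main__":
    fill = single_quadrant_fill(0)
    print(len(fill), format(fill[0], "o"), format(fill[-1], "o"))
```

With a data field of 0 the addresses are the multiples of 4 from 0 to 774 (octal), matching the stepping described in the text.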
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3.3 TOPS-10 Paging and Process Tables 
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Paging 

All of memory both virtual and physical is divided into pages of 512 words 
each. The virtual memory space addressable by a program is 512 pages; the 
locations in virtual memory are specified by 18-bit addresses, where the left 
nine bits (18-26) specify the page number and the right nine (27-35) the 
location within the page. Physical memory can contain 8192 pages and 
requires 22-bit addresses, where the left thirteen bits (14-26) specify the 
page number. The hardware maps the virtual address space into a part of 
the physical address space by transforming the 18-bit addresses into 22-bit 
addresses. 15 In this mapping the right nine bits of the virtual address are 
not altered; in other words, a given location in a virtual page is the same 
location in the corresponding physical page. The transformation maps a 
virtual page into a physical page by substituting a 13-bit physical page 
number for the 9-bit virtual page number. The mapping procedure is car- 
ried out automatically by the hardware, but the page map that supplies the 
necessary substitutions is set up by the kernel mode program. Each word in 
the map provides information for mapping two consecutive pages with the 
substitution for the even numbered page in the left half, the odd numbered 
page in the right half. 

The pager contains two 13-bit registers that the Monitor loads to spec- 
ify the physical page numbers of the user and executive process tables. To 
retrieve a map word from a process table, the pager uses the appropriate 
base page number as the left thirteen bits of the physical address and some 
function of the virtual page number as the right nine bits. For example, the 
entire user space of 512 virtual pages at two mappings per word requires a 
page map of just half a page, and this is the first half page in the user 
process table. Thus locations 0-377 in the table hold the mappings for
pages 0 and 1 to 776 and 777. To find the desired substitution from the 9-bit
virtual page number, the hardware uses the left eight bits to address the 
location and the right bit to select the half word (0 for left, 1 for right). 
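The address arithmetic described above is easy to check by hand or in code. The following sketch splits an 18-bit virtual address into page number and location, forms a 22-bit physical address from a 13-bit physical page number, and finds the user page map word and half word for a virtual page. The function names are hypothetical.

```python
def split_virtual(address):
    """Split an 18-bit virtual address into (virtual page, location in page)."""
    return address >> 9, address & 0o777


def physical_address(physical_page, virtual_address):
    """Substitute a 13-bit physical page number for the 9-bit virtual one."""
    return (physical_page << 9) | (virtual_address & 0o777)


def user_map_slot(virtual_page):
    """Return (location in user process table, half word) for a virtual page.

    The left eight bits of the page number address the location;
    the right bit selects the half (0 for left, 1 for right).
    """
    return virtual_page >> 1, "right" if virtual_page & 1 else "left"


if __name__ == "__main__":
    page, offset = split_virtual(0o436123)
    loc, half = user_map_slot(page)
    print(format(page, "o"), format(offset, "o"), format(loc, "o"), half)
```

For example, virtual address 436123 lies in page 436, whose mapping is in the left half of user process table location 217.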

The executive virtual address space is also 256K, but the page map for 
it is in three parts. The map for the first 112K (pages 0-337) is in executive 
process table locations 600-757. The map for the second half of the virtual 
address space uses the same locations in the executive process table as are 
used in the user process table for the user map (locations 200-377 for pages 
400-777). The map for the remaining 16K in the first half of the executive 
virtual address space is in the user process table, the mappings for pages 
340-377 being in locations 400-417. This means the Monitor can assign a 
different set of thirty-two physical pages (the per-process area) for its own 
use relative to each user. Hence when switching from one user to another, 
the Monitor need change only the user process table, this single substitu- 
tion making whatever change is necessary in the executive address space 
for a particular user. 
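The three-part executive map can be expressed as a lookup that returns which process table holds the mapping for a given executive virtual page, and at which location. This is a sketch of the layout just described; the function name and the returned tuple shape are hypothetical.

```python
def executive_map_slot(virtual_page):
    """Locate the map half word for an executive virtual page.

    Returns (process table, location in that table, half word).
    """
    if not 0 <= virtual_page <= 0o777:
        raise ValueError("virtual page must be 0-777 octal")
    half = "right" if virtual_page & 1 else "left"
    if virtual_page <= 0o337:
        # First 112K: executive process table locations 600-757.
        return "executive", 0o600 + (virtual_page >> 1), half
    if virtual_page <= 0o377:
        # Per-process area: user process table locations 400-417.
        return "user", 0o400 + ((virtual_page - 0o340) >> 1), half
    # Second half of the space: executive process table locations 200-377.
    return "executive", virtual_page >> 1, half


if __name__ == "__main__":
    for page in (0o0, 0o337, 0o340, 0o377, 0o400, 0o777):
        table, loc, half = executive_map_slot(page)
        print(format(page, "o"), table, format(loc, "o"), half)
```

Because pages 340-377 resolve through the user process table, changing the user base address alone changes the per-process area seen by the executive.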



15 For paging purposes page 0 has only 496 locations using addresses 20-777, as addresses
0-17 reference fast memory, which is unrestricted and available to all programs. (In 
general a user cannot reference the first sixteen storage module locations in his virtual 
page 0.) Throughout this discussion it is assumed that all references are to storage. 
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Figure 3.1: TOPS-10 Virtual Address Space and Process Table Layout 
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Figure 3.2: TOPS-10 Process Table Configuration 
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Figures 3.1 and 3.2 show the organization of the virtual address spaces, 
the process tables and the maps for both user and executive. The first 
illustration gives the correspondence between the various parts of the ad- 
dress spaces and the corresponding parts of the page maps. The second 
illustration lists the detailed configuration of the process tables as deter- 
mined by the hardware. Any table locations not used are reserved for fu- 
ture use by the hardware or for use by the Monitor for software functions. 
Note that the numbers in the half locations in the page map are the virtual 
pages for which the half words give the physical substitutions. Hence location
217 in the user page map contains the physical page numbers for
virtual pages 436 and 437.

Although the virtual space is always 256K by virtue of the addressing 
capability of the instruction format, the Monitor usually limits the actual 
address space for a given program by defining only certain pages as accessi- 
ble. 16 The Monitor also specifies whether each page is public or not, writ- 
able or not, and cacheable or not. The cache bit has an effect only if cache 
use is enabled as the current cache strategy (§3.2); in this case a 1 in the 
cache bit allows loading the cache for the physical page when referenced as 
this particular virtual page, whereas a 0 limits cache use to look but do not
load. Each word in the page map has this format to supply the necessary 
information for two virtual pages. 









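[Map word format: the left half (bits 0-17) holds the data for the even virtual page and the right half (bits 18-35) the data for the odd virtual page. Each half holds the page use bits A, P, W, S and C in bits 0-4 (18-22), followed by physical page address bits 14-26 in bits 5-17 (23-35).]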



Bits 5-17 and 23—35 contain the physical page numbers for the even and 
odd numbered virtual pages corresponding to the map location that holds 
the word. The properties represented by Is in the remaining "page use" bits 
are as follows. 

Bit Meaning of a 1 in the Bit 

A Access allowed 

P Public 

W Writable (not write-protected) 

S Software (not interpreted by the hardware) 

C Cacheable 
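A map word can be unpacked into its two mappings as follows. The sketch assumes the bit order shown in the format diagram (A, P, W, S, C in the first five bits of each half word, then the 13-bit physical page number); the function names and dictionary keys are hypothetical.

```python
def decode_half(half):
    """Decode one 18-bit map half word into its page use bits and page number."""
    return {
        "A": (half >> 17) & 1,   # access allowed
        "P": (half >> 16) & 1,   # public
        "W": (half >> 15) & 1,   # writable
        "S": (half >> 14) & 1,   # software
        "C": (half >> 13) & 1,   # cacheable
        "physical_page": half & 0o17777,
    }


def decode_map_word(word):
    """Return (even page mapping, odd page mapping) for a 36-bit map word."""
    return decode_half((word >> 18) & 0o777777), decode_half(word & 0o777777)


if __name__ == "__main__":
    left = (0b10101 << 13) | 0o1234    # A, W and C set, physical page 1234
    right = (0b11111 << 13) | 0o17777  # all use bits set, physical page 17777
    even, odd = decode_map_word((left << 18) | right)
    print(even)
    print(odd)
```

A word of all zeros decodes as two inaccessible pages, which is why a page table line cannot be marked invalid simply by clearing it.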

Page Table. If the complete mapping procedure described above were 
actually carried out in every instance, the processor would require two 
memory references for every reference by the program. To avoid this, the 



16 There is no requirement that the accessible space be continuous — it can be scattered 
pages. The convention however is for the accessible space to be in two continuous virtual 
areas, low and high, beginning respectively at locations 0 and 400000. The low part is
generally unique to a given user and can be used in any way he wishes. The (perhaps 
null) high part is a reentrant area, which is shared by several users and is therefore 
write-protected. 
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pager contains a page table, in which it keeps a large assortment of map- 
pings for both the executive and the current user. In a manner analogous to 
the way the cache is organized to handle word groups of four, the pager
handles mappings in sets of eight. A page set is eight consecutively
numbered pages beginning with one whose number is a multiple of 10 (octal).
Each page set consists of those pages whose mappings are contained in a
single word group in the page map. The 512 locations in the page table are
contained in sixty-four lines, each of eight locations holding the mappings 
for the eight pages of a set. The lines are identified by the possible page-set 
numbers in an address space, 0-77, and the individual locations are ac- 
cessed by means of the virtual page numbers, 0-777. Each location has a 
parity bit and the complete mapping (i.e. map half word) for the virtual 
page that identifies it, including the physical page number and the five 
page use bits. Associated with each line are a bit that indicates whether the 
specified page set is in the user or executive address space, and a bit that
indicates whether the set of mappings is valid or not (it is not suitable to
clear a line as zero is a perfectly valid mapping, albeit for an inaccessible
page). The user and validity bits for all lines collectively constitute the
page table directory. 

When the program references a page contained in a page set whose 
mapping entry is tagged as valid and in the program address space, the 13- 
bit physical number from the mapping location for the virtual page is used 
as the left thirteen bits in the physical address for the memory reference 
(provided of course that the reference is allowable according to the A, P and 
W bits). If however the mapping set is invalid or is not for the correct 
address space, the pager makes a memory reference (referred to as a "page 
refill cycle") to get the word group containing the mapping for the specified 
virtual page from the page map. Even when there is no cache, all eight
mappings from the word group are entered into the page table, filling and 
validating the line for the page set. This means the mappings will also be in 
the table for subsequent references to pages in the same set, although some 
may require a trap to the Monitor to make them accessible. 

Note that all the mappings in an entire line of the page table are for a 
single space, user or executive. Since most programs are written beginning 
at page 0 (and often page 400 for a pure part), a mechanism is built into the
table to avoid excessive refills due to switching between user and executive. 
In the numbers actually used to select lines in the table, the value of ad- 
dress bit 19 is inverted in user address space. For a given page number, this 
causes a difference of 200 in the line selection number for user space as 
against executive space. Suppose the executive uses pages 0-37 and 
400-437, and also uses the per-process area, pages 340-377. Then if the 
user is limited to pages 0-137, 240-577 and 640-777, no conflict will ever 
occur between them in the page table. 
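The claim that these page ranges never conflict can be checked mechanically. The sketch below assumes that a page table line is selected by the page-set number (the page number shifted right three bits) after address bit 19, which has the value 200 octal within a page number, is inverted for user space. The names are hypothetical, and the model ignores everything about the page table except line selection.

```python
def table_line(virtual_page, user):
    """Return the page table line (0-77 octal) selected for a virtual page."""
    slot = virtual_page ^ 0o200 if user else virtual_page  # invert bit 19 for user
    return slot >> 3                                       # eight pages per line


def lines_for(ranges, user):
    """Return the set of lines used by inclusive (first, last) page ranges."""
    return {table_line(p, user) for lo, hi in ranges for p in range(lo, hi + 1)}


# Page ranges from the example in the text (octal).
EXEC_RANGES = [(0o0, 0o37), (0o340, 0o377), (0o400, 0o437)]
USER_RANGES = [(0o0, 0o137), (0o240, 0o577), (0o640, 0o777)]

if __name__ == "__main__":
    exec_lines = lines_for(EXEC_RANGES, user=False)
    user_lines = lines_for(USER_RANGES, user=True)
    print(len(exec_lines), len(user_lines), exec_lines.isdisjoint(user_lines))
```

Under this model the executive occupies twelve lines and the user the other fifty-two, so the two sets together account for all sixty-four lines without overlap. A user page outside the listed ranges, such as page 200, would select the same line as executive page 0.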



Page Failure 

When for any reason the pager is unable to make a desired memory reference,
an event known as a "page failure" occurs. For this the pager terminates
the instruction immediately, without disturbing PC or storing any
results in memory or the accumulators, and executes a page fail trap. 17 The
trap operation makes use of three locations in the user process table: it 
places a page fail word in location 500, identifies the failed state of the 
processor by placing the current PC word in location 501, and sets up the 
flags and PC according to a new PC word in location 502. The processor 
then resumes operation in the new state at the location now addressed by 
PC. The page fail word supplies this information. 



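[Page fail word format: bit 0 = user or executive address space; bits 1-5 = failure type for a hard failure; bit 8 = virtual address given; virtual address in the right half, with the page number in bits 18-26. If bit 1 is 0, bits 1-7 instead hold the reference type bit T and the page use bits A, W, S, P and C.]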



Whether the violation occurred in user or executive address space is indicated
respectively by a 1 or 0 in bit 0; and a 1 or 0 in bit 8 indicates whether
or not a virtual address was given for the reference. If bit 1 is 1, bits 6 and 7
are indeterminate, and the number in bits 1-5 (≥ 20) indicates the type of
"hard" failure as follows.

21 Proprietary violation: an instruction in a public page has attempted
to reference a concealed page, or a public program has attempted to
fetch an instruction from a concealed page at an illegal entry point
(one not containing a PORTAL). The failure for an illegal entry
(which forces bit 8 to 0) occurs at the next reference, after the instruction
is decoded, so the fail address is meaningless.

22 Page refill failure: this is a hardware malfunction. The pager found
no mapping for the virtual page in the page table, so it refilled the
line from the page map but still could not find it.

23 Address failure: this is caused by the satisfaction of an address
condition selected by the program. It is used for debugging purposes,
such as to find an instruction that is maliciously wiping out a memory
location, and is explained in §3.5 with the description of the DATAO
APR, instruction that sets it up. Bit 8 is forced to 0 by this failure.

25 Page table parity error: the pager has encountered a page table
mapping with incorrect parity.

36 AR parity error: the processor has detected incorrect parity in a
word read into AR from a storage module, the cache, or the E bus, and
has saved the word with correct parity in AC 0, block 7. When the
source is a storage module, the MB Parity Error flag is also set (CONI
APR, bit 27).

37 ARX parity error: the processor has detected incorrect parity in a
word read into ARX from a storage module or the cache, and has
saved the word with correct parity in AC 1, block 7. When the source
is a storage module, the MB Parity Error flag is also set (CONI APR,
bit 27).



17 A page failure that occurs during an interrupt instruction does not act this way. Instead
it places a page fail word in AC 2, block 7, and sets the In-out Page Failure flag (CONI
APR, bit 26), requesting an interrupt on the level assigned to the processor.

If the failure is not one of these, then bits 1-7 have the format shown 
above, where A, W, S, P and C are simply the corresponding bits taken from 
the mapping for the page specified by bits 18-26, and T indicates the type 
of reference in which the failure occurred: 0 for a read-only reference, 1
for any reference involving writing. The type of reference per se implies 
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reference was being made. Of course T being 1 in conjunction with W being 
0 certainly implies the cause of failure.

For a page fail trap, the new PC word is set up by the Monitor to 
transfer control to kernel mode. After rectifying the situation, the Monitor 
returns to the interrupted instruction, which starts over again from the 
beginning or from the stopping position in a multipart instruction. Even a 
two-part instruction that has been stopped by a failure in the second part is 
redone properly, provided the Monitor restores First Part Done. The mecha- 
nism for making a correct return and the effects it produces on a BLT are 
the same as for an interrupt, and are described under the special consid- 
erations given at the end of §3.1. 

Note that a soft failure 18 seldom implies that anything is "wrong" — 
unless a program has attempted to write in a truly write-protected area. 
Consider a typical case where the Monitor has, for example, ten or twenty 
pages of a user program in core; these would be the virtual pages indicated 
as accessible. When the user attempts to gain access to a page that is not 
there (a virtual page indicated in its mapping as inaccessible), the Monitor 
would respond to the page failure by bringing in the needed page from the 
disk, either adding to the user space or swapping out a page the user no 
longer needs. 

The same situation exists for writability. When bringing in a user 
program, the Monitor would ordinarily indicate as writable only the buffer 
area and other pages that will definitely be altered, distinguishing those 
that must be revised on the disk at the end from those that can be thrown 
away by setting the software bit. Then in response to a write failure, the 
Monitor makes the page writable and sets the software bit to indicate to 
itself that that page has in fact been altered and must be saved. When the 
user is done, the Monitor need write back onto the disk only those pages for 
which both W and S are set. 
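The bookkeeping with the W and S bits amounts to the following. This is a toy model of the policy described above, not of any actual Monitor data structure; the dictionary layout and the function names are hypothetical.

```python
def on_write_failure(page):
    """Respond to a write failure on a page the user may legitimately alter."""
    page["W"] = 1   # make the page writable
    page["S"] = 1   # remember that it has been altered and must be saved


def pages_to_write_back(pages):
    """Return the virtual pages that must be rewritten on the disk."""
    return [number for number, page in sorted(pages.items())
            if page["W"] and page["S"]]


if __name__ == "__main__":
    pages = {
        0o10: {"W": 1, "S": 1},  # buffer area: writable, must be saved
        0o11: {"W": 1, "S": 0},  # scratch page: writable, can be thrown away
        0o12: {"W": 0, "S": 0},  # brought in unaltered
    }
    on_write_failure(pages[0o12])  # user stores into page 12
    print([format(n, "o") for n in pages_to_write_back(pages)])
```

Pages that were never written, or that were marked as disposable, are simply dropped when the user is done.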



18 In a soft page failure or page table parity error, the line containing the mapping for the 
page is invalidated on the assumption the Monitor will change it. When the instruction is 
restarted, the pager must go to the page map to get new information for the table. 
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The Map Instruction 

It is often helpful for the Monitor or a debugging package to be able to 
determine how the pager would respond to a particular reference without 
actually chancing a page failure. It may also be useful to determine where 
a particular virtual page is in physical memory, e.g. to set up a channel 
command list. For such purposes the processor has this instruction, which 
unlike all other instructions described in this chapter, is not an IO instruction
even though it is subject to the same restrictions.



MAP Map an Address

[Instruction format: bits 0-8 = 257, bits 9-12 = A, bit 13 = I, bits 14-17 = X, bits 18-35 = Y]



If the pager is on and the processor is in kernel or user IO mode, map the
page number of the virtual effective address E and place the resulting 
physical address and other map data in AC. The information loaded into 
AC for a true mapping is of the form 



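[AC format for a true mapping: bit 0 = user or executive address space; page use bits A, W, S, P and C within bits 1-7; physical address in bits 14-35]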



where bits 14-26 are the physical page number the pager supplies for E, bit
0 is 1 or 0 depending on whether the paging is done in user or executive
address space, and A, W, S, P and C are the page use bits from the mapping 
as explained above. If however there is a parity error in the page table 
entry, or the paging is done in user mode public but the page, while accessi- 
ble, is private, AC receives 



[AC format for a failure: bit 0 = user or executive address space; failure type in bits 1-5; physical address in bits 14-35]



The failure code can be only 21 or 25 for a proprietary or parity error, 
where in the latter case those bits supplied by the mapping, 6, 7 and 14-35, 
are meaningless. 

This instruction cannot be performed in a user program unless User In- 
out is set, nor in a supervisor program. Instead of mapping the address, it 
executes as an MUUO. If the pager is off, the result is undefined. 

Notes. The instruction itself cannot fail because it does not actually 
reference memory: it just translates the address and gets other mapping 
data. However the effective address calculation could fail, and getting the 
mapping may require a refill, in which a hard failure could occur. 
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In order properly to manage memory, the kernel program must select the 
kind of paging and the cache strategy, set up process tables and page maps 
for itself and the various users, oversee the operation of the page table, and 
select the fast memory block to be used by each program (usually block 
0 for itself). At any given time, accumulator, index register and fast memory
references are made to that AC block that is assigned as "current." Given a 
particular processor mode (user or executive, public or private) and an ap- 
propriate process table and page map, the Monitor effectively defines the 
address space for a process (which may be itself) by specifying the base 
address for the process table and selecting the current AC block. 

When a user program calls the Monitor it is usually to request some 
activity, which may often require the executive to gain access to the user 
address space. To facilitate the crossover from one address space to another, 
the same instruction through which the Monitor assigns its own current 
AC block also allows assignment of an AC block and section for the "previous
context", i.e. the context of the process that made the call. These
quantities, together with flags that indicate the mode of the caller, allow 
execution of instructions in the previous context (more about this subject 
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later). At any point in time, the previous context is essentially the circum- 
stances in which the previous process was running. Note that the previous 
context need not be the user; the same techniques can be exploited following
a call from one level of the Monitor to another.

For initial setup, the kernel program must be cognizant of certain fun- 
damental characteristics that can vary from one system to another. For this 
purpose the instructions for basic management include not only those that 
address the pager, but also one that addresses the processor to discover 
what those characteristics are. 

The device code for the pager is 010, mnemonic PAG. 33



APRID Arithmetic Processor Identification

[Instruction format: bit 13 = I, bits 14-17 = X, bits 18-35 = Y]



Read the microcode version number, the processor serial number, and a 
listing of the fundamental characteristics of the system into location E as 
shown. 



[Format of the word read: microcode options at the left end of the left half (TOPS-20 paging, extended address, exotic microcode), followed by the microcode version number; hardware options beginning at bit 18 of the right half (50 Hz, cache, channel, extended KL10, master oscillator), followed by the processor serial number]



0 The microcode implements paging for the TOPS-20 Monitor; 0 indicates
TOPS-10 paging.

1 The microcode handles extended addresses.

2 The microcode differs in some way from the standard version.

18 Line power frequency is 50 Hz rather than the standard 60 Hz.

21 The processor is an extended KL10; 0 indicates a single-section KL10.
The microcode options must of course be consistent with the processor
type.

22 The system has a master oscillator, which is available as an external
clock source. In a system containing MOS memory, the software must
select this source (CPU clock source 2) from the PDP-11.
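An initializing program would typically test these bits individually. The sketch below decodes only the option bits whose meaning is listed above (0, 1, 2, 18, 21 and 22); the table of names and the function are hypothetical, and the version and serial number fields are deliberately left out.

```python
# Option bits described in the text, numbered 0-35 from the left of the word.
APRID_FLAGS = {
    0: "TOPS-20 paging",
    1: "extended addresses",
    2: "nonstandard microcode",
    18: "50 Hz line power",
    21: "extended KL10",
    22: "master oscillator",
}


def aprid_options(word):
    """Return the names of the listed option bits that are 1 in an APRID word."""
    return [name for bit, name in sorted(APRID_FLAGS.items())
            if (word >> (35 - bit)) & 1]


if __name__ == "__main__":
    word = (1 << 35) | (1 << (35 - 21))  # bits 0 and 21 set
    print(aprid_options(word))
```

The only arithmetic involved is converting a bit number n in the 36-bit word to a shift count of 35 - n.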



33 BLKI PAG, is unassigned and executes as an MUUO. 
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CONO PAG, Conditions Out, Pager 
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Set up the system-oriented characteristics of the pager according to the 
effective conditions E as shown. 



[Format of E: bits 18-19 = cache strategy (look, load); bit 21 = TOPS-20 paging; bit 22 = enable pager; bits 23-35 = executive base address (page number)]



Load bits 23-35 into the executive base register to select the executive
process table. If bit 22 is 1 enable overflow trapping and enable the pager
for the type of paging selected by bit 21: 1 for TOPS-20, 0 for TOPS-10. The
paging selected must be the same as that implemented by the microcode as
indicated by APRID bit 0. A 0 in bit 22 prevents traps and disables paging
so all memory references are to physical locations unpaged. 34

CAUTION 

Paging can be disabled only for executive mode. A user mode 
program will not run correctly unless the pager is turned on. 

Select the cache strategy according to bits 18 and 19 as follows:

0x Disable the cache.

10 Look for all references, but do not load physical references; for virtual
references act as directed by the cache bit in the mapping for the
page.

11 Make complete use of the cache for physical references; for virtual
references act as directed by the cache bit in the mapping for the
page.

Invalidate the entire page table by setting the invalid bits in all lines. 
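The conditions word for CONO PAG, can be assembled field by field. The sketch below is a minimal illustration that assumes the look and load bits occupy bits 18 and 19 of E, as the format diagram suggests, with bit 21 selecting TOPS-20 paging, bit 22 enabling the pager and bits 23-35 holding the executive base page number. The function and parameter names are hypothetical.

```python
def cono_pag_conditions(look, load, tops20, enable, exec_base):
    """Build the 18-bit effective conditions E for CONO PAG,.

    Bit n of the 36-bit numbering (18 <= n <= 35) has weight 2**(35 - n).
    """
    if not 0 <= exec_base <= 0o17777:
        raise ValueError("executive base page number must fit in 13 bits")
    e = exec_base                 # bits 23-35
    e |= bool(look) << 17         # bit 18: cache look
    e |= bool(load) << 16         # bit 19: cache load
    e |= bool(tops20) << 14       # bit 21: 1 for TOPS-20 paging, 0 for TOPS-10
    e |= bool(enable) << 13       # bit 22: enable traps and the pager
    return e


if __name__ == "__main__":
    # Full cache use, TOPS-10 paging enabled, executive process table in page 123.
    print(format(cono_pag_conditions(1, 1, 0, 1, 0o123), "o"))
```

Giving the instruction also invalidates the page table, so the value computed here only describes the conditions that are loaded.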
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Read the system status of the pager into the right half of location E. The 
information read is the same as that supplied by a CONO. 



34 Note that disabling the pager does not mean there can be no page failures, as these can be 
caused by conditions having nothing to do with paging, i.e. with translating virtual to 
physical addresses. 
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DATAO PAG, Data Out, Pager 
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Set up the process-oriented elements of the pager according to the contents 
of location E as shown. 



[Format of the word at E: bits 0-2 = change indicators (select AC blocks, select previous context section, load user base address); bits 6-8 = current AC block; bits 9-11 = previous context AC block; bits 13-17 = previous context section; bit 18 = inhibit updating of accounts; bits 23-35 = user base address (page number)]
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Bits 0-2 are change indicators for parts of the data word: when a bit is 0, 
the corresponding part of the word is ignored, and the equivalent value 
supplied by a previous DATAO remains in effect. 

If bit 0 is 1, select as the current and previous context AC blocks those
specified by bits 6-8 and 9-11, respectively. If bit 1 is 1, select as the 
previous context section that specified by bits 13-17 (which must be zero in 
a single section processor). If bit 2 is 1, perform these functions: 

If bit 18 is 0, update the user accounts as explained in §3.6.

Load bits 23-35 into the user base register to select the user process table.

Invalidate the entire page table by setting the invalid bits in all lines.
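The DATAO PAG, data word can be built from the fields just described, with each change indicator set only when its part is supplied. This is a sketch under the bit assignments given in the text; the function name and the keyword arguments are hypothetical, and the current and previous context AC blocks must be supplied together because a single indicator bit governs both.

```python
def datao_pag_word(current_ac=None, previous_ac=None, previous_section=None,
                   user_base=None, inhibit_accounts=False):
    """Build the 36-bit data word for DATAO PAG,.

    Bit n of the word has weight 2**(35 - n); a part left as None is not
    changed by the instruction.
    """
    if (current_ac is None) != (previous_ac is None):
        raise ValueError("give both AC blocks or neither")
    word = 0
    if current_ac is not None:
        word |= 1 << 35                           # bit 0: select AC blocks
        word |= (current_ac & 0o7) << 27          # bits 6-8
        word |= (previous_ac & 0o7) << 24         # bits 9-11
    if previous_section is not None:
        word |= 1 << 34                           # bit 1: select section
        word |= (previous_section & 0o37) << 18   # bits 13-17
    if user_base is not None:
        word |= 1 << 33                           # bit 2: load user base address
        word |= bool(inhibit_accounts) << 17      # bit 18: 1 inhibits accounting
        word |= user_base & 0o17777               # bits 23-35
    return word


if __name__ == "__main__":
    # Monitor takes AC block 0 and keeps the caller's block 1 as previous context.
    print(format(datao_pag_word(current_ac=0, previous_ac=1), "012o"))
```

Leaving `user_base` unset keeps bit 2 at 0, so the base address, the page table and the user accounts are all left alone.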



DATAI PAG, Data In, Pager
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Read the process status of the pager into location E. The information read 
is in the same format as that supplied by a DATAO (bits 0-2 are Is and bit 
18 is 0). Note however that only the AC block designations and user base 
address are necessarily the same information supplied by a previous 
DATAO. When an MUUO stores its own context as given by the DATAO 
that set up the process containing it, it changes the designation of the 
previous context section to that in which the program is currently running. 
Hence following a call by an MUUO, a DATAI PAG, in the called program
will see as the previous context section that specified by PC at the time the 
MUUO was performed. 
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CLRPT 



Clear Page Table Entry

[Instruction format: bit 13 = I, bits 14-17 = X, bits 18-35 = Y]



TOPS-20: Invalidate the page table mapping entry for the page referenced
by E.

TOPS-10: Invalidate the page table line (eight entries) containing the
mapping for the page referenced by E.



At power turnon the contents of the cache and page table are indeterminate,
the processor is in kernel mode, paging is disabled, the cache is off,
and the current AC block is 0 by default. After the front end loads the
microcode, it then loads the initializing kernel program. This program, 
running unpaged in physical memory, should give an APRID to determine 
system characteristics and an SWPIA to invalidate the cache. The unpaged 
program ends with a CONO PAG, that selects the cache strategy, selects 
and enables paging, specifies the executive base address, and invalidates 
the page table. From this point the kernel program runs paged and must 
set up the first user or users, loading the user process tables and page maps, 
bringing in whatever parts of user programs and data that are consistent 
with good working-set management, and setting up the timing and ac- 
counting meters. Finally the Monitor gives a DATAO PAG, to assign the 
base address and current AC block for the first user, and then transfers 
control to the user program via an XJRSTF or JRSTF. The initial DATAO 
PAG, should have a 1 in bit 18 to inhibit updating accounts before any user 
has run. 

On a call from the user via an MUUO, give a DATAI PAG, to deter- 
mine the context of the user, i.e. his AC block and section. Then give a 
DATAO PAG, that assigns block 0 as current for the Monitor, assigns the
user AC block and section as previous context for accessing user space, but 
leaves the base address alone so the right paging is still available for such 
access. To return to the same user, reassign the AC block without changing 
the base address. Leaving the base address alone also avoids unnecessary 
updating of user accounts. Note that on the transfer to a user program no 
previous context values need be given as the user cannot employ PXCTs. 
For switching from one user to another, give a DATAO PAG, that updates 
the first user's accounts in his process table, as specified by the old base 
address, and then loads a base address for the new user. The transfer to a 
user is done with a JRSTF or XJRSTF; the latter also restores the previous 
context section when used to return from a higher to a lower level within 
the executive. 
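
The DATAO PAG, words used in the procedures above can be sketched in Python as a check on the bit layout. This is an illustration only, not Monitor code: the helper names (put, datao_pag_word) and the base page number 1234 are made up for the example, and the field positions are taken from the DATAO PAG, description (bits 0-2 change indicators, bits 6-8 and 9-11 AC blocks, bits 13-17 previous section, bit 18 accounting inhibit, bits 23-35 user base address).

```python
WORD_BITS = 36  # PDP-10 word; bit 0 is the most significant bit


def put(value, first, last):
    """Return value positioned in bits first..last of a 36-bit word."""
    width = last - first + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in bits {first}-{last}")
    return value << (WORD_BITS - 1 - last)


def datao_pag_word(ac_blocks=None, prev_section=None,
                   user_base=None, inhibit_accounts=False):
    """Build a DATAO PAG, data word (hypothetical helper).

    ac_blocks is a (current, previous) pair. Any part left as None keeps
    its change indicator (bit 0, 1 or 2) at 0, so the pager ignores it.
    """
    word = 0
    if ac_blocks is not None:
        current, previous = ac_blocks
        word |= put(1, 0, 0) | put(current, 6, 8) | put(previous, 9, 11)
    if prev_section is not None:
        word |= put(1, 1, 1) | put(prev_section, 13, 17)
    if user_base is not None:
        word |= put(1, 2, 2) | put(user_base, 23, 35)
        if inhibit_accounts:
            word |= put(1, 18, 18)  # bit 18 = 1: do not update accounts
    return word


# Initial DATAO for the first user: AC block 1, base page 1234 (octal,
# invented for the example), bit 18 set so no accounts are updated.
first_user = datao_pag_word(ac_blocks=(1, 0), user_base=0o1234,
                            inhibit_accounts=True)

# MUUO call: block 0 current for the Monitor, the user's block 1 and
# section 0 as previous context, base address left alone (bit 2 stays 0).
muuo_entry = datao_pag_word(ac_blocks=(0, 1), prev_section=0)

print(f"{first_user:012o}")
print(f"{muuo_entry:012o}")
```

Leaving user_base as None corresponds to the case in the text where the base address is left alone, which also avoids an unnecessary update of the user accounts.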

The usual procedure for administering AC blocks is to assign some to 
individual user programs on a semipermanent basis for special applications 
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and to assign block 1 to all other users. 35 In this way the Monitor need not 
store their blocks when the special users are not running, and it need not 
store block 1 when it takes control from an ordinary user temporarily. If 
the Monitor shared block 0 with any users, it would have to store the user
accumulators even when taking control only temporarily. When switching
from one ordinary user to another, the Monitor usually stores the first
user's accumulators in his process table or shadow area (locations
0-17 in user virtual page 0, an area not generally accessible to the user at
all) and loads the new user's accumulators from his process table or
shadow area, where they were stored after the last time the new user ran.
On a change from one process to another the entire page table must be 
invalidated, but this is done automatically by the instruction that assigns 
the new user base address. If the system uses shared or indirect pointers, or 
several virtual page numbers point to the same physical page, then the 
table must be invalidated whenever a page is removed from memory or a
pointer is removed from a user section table or page map. On the other
hand deletion of a page with a unique mapping requires only that a CLRPT 
be given to invalidate the line containing it. In multiprocessor operation all 
page tables must be cleared whenever one is. CST entries can be used to 
communicate paging information from one processor to another. 

Previous Context Execute 

Ordinarily an instruction in a user program is performed entirely in user 
address space, and an instruction in the executive program is performed 
entirely in executive address space. But to facilitate communication between
Monitor and users, the executive can execute instructions in which
selected references cross over the boundary between user and executive 
address spaces. This feature is implemented by the previous context exe- 
cute, or PXCT, instruction. The mnemonic PXCT is for convenience only 
and has no meaning to the assembler; it is used simply to indicate an XCT
with nonzero A bits. A PXCT is an XCT. Although the PXCT is given by a 
program in the current context, some of the references made by the executed
instruction can be in the previous context. A PXCT can be given only
in executive mode, but the previous context may be the user, as following a 
call to the Monitor by the user. The previous context can however be the 
executive, to allow communication between one level of the executive pro- 
gram and another, as when the Monitor gives an MUUO to itself. (Note: it 
is not intended that PXCT be used by the Monitor for unsolicited references 
to a user program.) 

It is very important to understand just which operations are affected by 
a PXCT and which are not. The only difference between an instruction 
executed by a PXCT and an instruction performed in normal circumstances 
is in the way certain of its memory and index register references are made. 
To work as a PXCT, an XCT must be given in executive mode, and the bits
in its A field (9-12) must not all be 0 (in user mode A is ignored). But there
is otherwise no difference in the way the XCT itself is performed: everything
in the PXCT is done in the current (executive) context, and the instruction
to be executed by the XCT is fetched in the current context. Moreover
in the executed instruction, all accumulator references (specified by
bits 9-12 of the instruction word) are in the current context. (Remember
that the executive can always access a user accumulator simply by addressing
it as a fast memory location.) If the instruction makes no memory
operand references, as in a shift or immediate mode instruction, and it has
no indexing or indirection (i.e. the instruction word gives E directly), then
its execution differs in no way from the normal case. The only difference is
in memory and index register references.

35 It may be worthwhile to assign a separate AC block for the sole use of interrupt routines.

The previous context is specified by four quantities. Following a call by 
an MUUO the section in which the calling program was running (its PC
section) and the fast memory block assigned to it appear as the previous 
context section and current context AC block in the word read by a DATAI 
PAG,. For the called program, these two quantities can then be assigned as 
the previous context by a DATAO PAG,. The current AC block of the call- 
ing program also appears in the process context word supplied by the 
MUUO. Various levels of the Monitor may all use fast memory block 0; or a 
separate block may be assigned to that part of the Monitor that uses PXCTs 
in handling MUUO calls from other parts of the Monitor. 

Just as the current mode is indicated by the User and Public flags, the 
mode in which the calling program was running is indicated by Previous 
Context User and Previous Context Public. 36 At a call these flags may be 
set up automatically or they may be set up by a flag-PC doubleword or a PC 
word. Note that the restrictions on references made in the previous context 
are those of the previous context — not those of the context in which the 
PXCT is given — with the single exception that if the current program is 
running in section 0, the previous context is also limited to section 0. Suppose
the executive executes an instruction that references the concealed
user area. Such a reference would fail if Previous Context Public were set. 

Which references in the executed instruction are made in the previous
context is determined by 1s in the A portion of the PXCT instruction word
as follows.

Bit   References Made in Previous Context if Bit is 1

 9    Effective address calculation of instruction, including both instruction
      words in EXTEND (index registers, address words by indirection); also
      EXTEND effective address calculation of source pointer if bit 11 is 1
      and of destination pointer if bit 12 is 1

10    Memory operands specified by E, whether fetch or store (e.g. PUSH
      source, POP or BLT destination); byte pointer; second instruction
      word in EXTEND

11    Effective address calculation of byte pointer; source in EXTEND;
      effective address calculation of EXTEND source pointer if bit 9 is 1

12    Byte data; stack in PUSH or POP; source in BLT; destination in
      EXTEND; effective address calculation of EXTEND destination
      pointer if bit 9 is 1

36 Previous Context User and Previous Context Public are in the same flag bits that are
used for User In-out and Overflow in user mode. The former has no meaning in executive
mode, and the latter is not really necessary as the executive program is not ordinarily
interested in performing extensive mathematical procedures.

Previous context referencing is useful and reasonable in some instruc- 
tions but inapplicable to others. There is no trap of any kind, and the effect 
of using the feature with an instruction to which it does not apply is simply 
undefined.



Inapplicable

LUUO, MUUO
AOBJN, AOBJP
JUMP, AOJ, SOJ
JSR, JSP, JSA, JRA, JRST
PUSHJ, POPJ
XCT, PXCT
Shift-rotate
String (except MOVSLJ)
IO

Applicable

Move, XMOVEI
EXCH, BLT, XBLT
Half word, XHLLI
Arithmetic
Boolean
Double move
CAI, CAM
SKIP, AOS, SOS
Logical test
PUSH, POP, ADJSP
Byte
MOVSLJ (extended KL10 only)
MAP

Note that no jumps can use previous context referencing. Even among 
the instructions to which such referencing is applicable, only a limited
number of the sixteen possible bit combinations is useful or meaningful. 
Doing an effective address calculation in the previous context (selected by 
bit 9 or 11) makes sense only if the corresponding data access is also in the 
previous context (as selected by bit 10 or 12 except 11 or 12 in EXTEND). 
Only these combinations are permitted. 



Instructions             9 10 11 12   References in Previous Context

General                  0  1  0  0   Data
                         1  1  0  0   E, data

Immediate                1  0  0  0   E

BLT                      0  0  0  1   Source
                         0  1  0  0   Destination
                         0  1  0  1   Source, destination
                         1  1  0  0   E, destination
                         1  1  0  1   E, source, destination

XBLT                     0  0  1  0   Source
                         0  0  0  1   Destination
                         0  0  1  1   Source, destination

Stack                    0  0  0  1   Stack
                         0  1  0  0   Memory data
                         0  1  0  1   Memory data, stack
                         1  1  0  0   E, memory data
                         1  1  0  1   E, memory data, stack

Byte                     0  0  0  1   Data
                         0  0  1  1   Pointer E, data
                         0  1  1  1   Pointer, pointer E, data
                         1  1  1  1   E, pointer, pointer E, data

MOVSLJ                   0  0  0  1   Destination
(extended KL10 only)     1  0  0  1   E (= Y), destination pointer, destination
                         0  0  1  0   Source
                         1  0  1  0   E (= Y), source pointer, source
                         0  0  1  1   Source, destination
                         1  0  1  1   E (= Y), pointers, source, destination

NOTE

An A of 1000 is the "correct" configuration for a PXCT of an immediate
mode instruction, but it inadvertently allows use of the current context
section rather than the previous context as would be desired in say the
PXCT of an XHLLI. To get the previous context section in the extended
KL10, use 1100 instead.
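
The rule stated ahead of the table (an effective address calculation in the previous context makes sense only if the corresponding data access is also in the previous context) can be sketched as a check on the A field. The Python below is an illustration under stated assumptions: the function names are invented, the check covers only instructions other than EXTEND and immediate mode (which the text treats as exceptions), and it tests a necessary condition only. A combination that passes may still not appear in the table for a given instruction.

```python
def decode_pxct_a(a):
    """Return the set of A-field bit numbers (9-12) that are 1.

    a is the 4-bit A field as an integer, with bit 9 as its high order bit.
    """
    if not 0 <= a <= 0b1111:
        raise ValueError("A field is four bits wide")
    return {bit for bit in (9, 10, 11, 12) if a & (1 << (12 - bit))}


def passes_pairing_rule(a):
    """Necessary condition for a non-EXTEND, non-immediate instruction.

    Bit 9 (effective address calculation) requires bit 10 (memory data);
    bit 11 (byte pointer address calculation) requires bit 12 (byte data).
    An A of 0 is an ordinary XCT, not a PXCT.
    """
    bits = decode_pxct_a(a)
    if not bits:
        return False
    if 9 in bits and 10 not in bits:
        return False
    if 11 in bits and 12 not in bits:
        return False
    return True


for a in range(16):
    print(f"{a:04b}", sorted(decode_pxct_a(a)), passes_pairing_rule(a))
```

Note that 1000 fails this check although the text gives it as the configuration for immediate mode instructions; that case is one of the stated exceptions and would need separate handling.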

Execution of a BLT by a PXCT is limited to these three cases: 

Where all operations, regardless of context, are in section 0.

Where the previous context fast memory block is being saved in or 
restored from the current context, which may be any section. (But re- 
member that regardless of context a BLT-given in-section address in the 
range 0-17 always refers to fast memory. Hence an AC block can never 
be saved in or restored from the first sixteen storage locations in any 
section.) 

Where all operations are confined to a single section in the previous 
context, as would be the case when clearing a user page. 

In all other circumstances XBLT must be used instead.

Address Debugging

The address failure, or address break, feature of the pager implements the 
traditional program debugging technique of catching a particular type of 
memory reference to a selected location (it does not catch fast memory 
references). It may be used to determine whether a given program is modi- 
fying a particular location, is executing a particular piece of code, or is 
simply using a particular block of data. This instruction uses the processor 
device code to specify the circumstances in which a break shall occur. 



DATAO APR, Data Out, Arithmetic Processor 
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Select the break address and the break conditions according to bits 9-35 of 
location E as shown (a 1 in a condition bit selects the condition indicated, a
0 makes no reference selection or selects the opposite address space).

[Figure: DATAO APR, word format. Bits 0-8 are reserved; condition bits 9-11 select the reference type (fetch, read, write) and bit 12 selects user space; bits 13-35 hold the break address.]



The break conditions selected by 1s in bits 9-12 are as follows.

 9    A normal fetch of an instruction in the program under control of PC.

10    Any reference that reads except the normal fetch of an instruction.
      This includes retrieval of operands, address words in an effective
      address calculation, or an instruction to be executed by an XCT or user
      LUUO.

11    Any reference that writes.

12    A reference made in user virtual address space (0 selects executive
      space). The break mechanism operates only for virtual address space.
      It does not catch microcode physical references, such as to the process
      tables.
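
As an illustration of the word layout just described, the following Python sketch assembles a DATAO APR, break word and reads the condition bits back. The function names and the example address are invented for the illustration; only the bit positions (conditions in bits 9-12, break address in bits 13-35) come from the text.

```python
def bit(n):
    """Mask for bit n of a 36-bit word, where bit 0 is the high order bit."""
    return 1 << (35 - n)


def break_word(address, fetch=False, read=False, write=False, user=False):
    """Assemble the word for a DATAO APR, (hypothetical helper)."""
    if not 0 <= address < (1 << 23):
        raise ValueError("break address must fit in bits 13-35")
    word = address
    for selected, n in ((fetch, 9), (read, 10), (write, 11), (user, 12)):
        if selected:
            word |= bit(n)
    return word


def break_conditions(word):
    """Decode bits 9-12; a DATAI returns the conditions but not the address."""
    return {
        "fetch": bool(word & bit(9)),
        "read": bool(word & bit(10)),
        "write": bool(word & bit(11)),
        "user": bool(word & bit(12)),
    }


# Catch any write to location 1000 (octal) in user virtual address space.
w = break_word(0o1000, write=True, user=True)
print(f"{w:012o}", break_conditions(w))
```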



Whenever the processor attempts one of the selected types of reference 
to the location specified by the break address in the selected virtual address 
space, a page failure results 37 unless the Address Failure Inhibit flag is set. 
This flag, which is bit 8 of the program flags and can be set only by an 
instruction that restores them, prevents an address failure during the next 
instruction — the completion of the next instruction automatically clears 
it. If an interrupt or trap intervenes, the flag has no effect and is saved and 
cleared if the flags are saved with PC. If it is not saved, it affects the 
instruction following the interrupt or trap. Otherwise it affects the instruction
following a return in which it is restored with PC. Using the inhibit
flag, the Monitor can return to a user instruction that caused an address 
failure and "get by it." 

Since this feature is entirely under the control of the above IO instruction,
it can be used quite flexibly for the executive to debug its own
routines, or to debug a single user program without bothering either the 
executive or other users. The break conditions in effect at any time can be 
ascertained by giving this instruction. 



37 Executive conditions also catch virtual references in interrupt functions, but the page
failure sets the In-out Page Failure flag instead of resulting in a trap for an address
failure.

Read the current break conditions into bits 9-12 of location E. The information
read is the same as that supplied by the last DATAO. (Note that the
break address cannot be read.)
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The processor includes a subsystem with elements for keeping track of 
time, use of system facilities, and use of individual system features. One 
element is a standard 12-bit interval counter that is set up by the program 
to interrupt when the count reaches a preset value. The others are meters 
for keeping a 59-bit count, wherein only the low order sixteen bits are 
implemented in hardware. In each case the actual counting is done in a 16- 
bit hardware counter, while the overall count is kept in a doubleword in a 
process table. A count is updated from its counter by a procedure that is 
performed periodically by the microcode and whenever appropriate to an 
operation requested by the software. In the update procedure the contents 
of a counter are added into the corresponding count and the counter is
cleared. Whenever the microcode checks for interrupt requests it updates 
any count whose counter is more than half full, i.e. whose MSB is 1. The 
current user accounts are generally updated when the Monitor switches to 
a new user. 

A doubleword count is a 59-bit unsigned quantity whose format and
relationship to the hardware counter are as shown here.

[Figure: doubleword count format. The even numbered word holds the high order part of the count in bits 0-35; the odd numbered word holds the low order part in bits 1-23, with bits 24-35 reserved. The 16-bit hardware counter lines up with count bits 43-58.]

The entire first word comprises the high order thirty-six bits, and the low
order twenty-three are in bits 1-23 of the second word. 38 Reserving bits for expansion at
the low order end guarantees format compatibility with future machines 
that may be much faster (and therefore require bits for counting smaller 
time units). Altogether there are four meters that use this counter- 
doubleword format. One is a straightforward time base that counts at 1 
MHz. Two keep track of process execution time and number of memory 
references for purposes of user accounting. Last is a mechanism for analyzing
system performance by investigating the use of individual system features,
either by counting the number of times particular events occur or
measuring the duration of time particular procedures are in progress.

38 Remember, it is a property of twos complement arithmetic that the sign can be used as an
extra magnitude bit in an unsigned number. But since the hardware is set up for signed
arithmetic, bit 0 of any lower order word must be skipped.

The program controls the various subsystem elements through two sets 
of IO instructions using device codes 20 and 24, mnemonics TIM and
MTR. 39 In general the meter code is for handling the accounting meters and 
the timer code is for the other elements, but the MTR conditions are for 
both. Data instructions read updated doubleword counts, but affect neither 
the counts nor the counters. Condition bits (in a CONO) directly affect only 
the 16-bit hardware counters. Of course a counter being enabled does mean 
updating of the doubleword count will probably occur. But to reset a count, 
the program must not only clear the hardware counter but separately clear
the corresponding pair of locations in the process table. 
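
The update procedure and the doubleword format described above can be modelled in a few lines of Python. This is a sketch for following the arithmetic, not a description of the microcode: the class and function names are invented, and the model assumes the counter is serviced often enough that it never wraps.

```python
COUNTER_BITS = 16               # hardware counter width
LOW_BITS = 23                   # count bits kept in the odd numbered word
COUNT_MASK = (1 << 59) - 1      # 59-bit unsigned count


class Meter:
    """One meter: a 16-bit hardware counter plus its 59-bit count."""

    def __init__(self):
        self.counter = 0
        self.count = 0

    def tick(self, n=1):
        """Advance the hardware counter by n events."""
        self.counter = (self.counter + n) & ((1 << COUNTER_BITS) - 1)

    def update(self):
        """Add the counter into the count and clear the counter."""
        self.count = (self.count + self.counter) & COUNT_MASK
        self.counter = 0

    def service(self):
        """Periodic check: update when the counter is more than half
        full, i.e. when its most significant bit is 1."""
        if self.counter >> (COUNTER_BITS - 1):
            self.update()

    def reset(self):
        """Resetting a count needs both the counter and the doubleword cleared."""
        self.counter = 0
        self.count = 0


def pack(count):
    """Split a 59-bit count into (even word, odd word).

    The low order 23 bits go in bits 1-23 of the odd word, which is a
    left shift of 12; bit 0 and bits 24-35 of that word stay 0.
    """
    even = count >> LOW_BITS
    odd = (count & ((1 << LOW_BITS) - 1)) << 12
    return even, odd


def unpack(even, odd):
    """Rebuild the 59-bit count from the doubleword."""
    return (even << LOW_BITS) | ((odd >> 12) & ((1 << LOW_BITS) - 1))


m = Meter()
m.tick(0x7FFF)
m.service()                     # MSB still 0: nothing is transferred
m.tick()
m.service()                     # MSB now 1: counter is added into the count
print(m.count, m.counter, pack(m.count))
```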

System Timing 

For regular system use, the processor provides a time base and an interval 
counter. The time base is a doubleword count (of the type described above)
kept in locations 510 and 511 of the executive process table. It counts 
elapsed time in microseconds (a rate of 1 MHz). Drift is guaranteed to be 
less than 5 seconds per day for at least the first six years of use. To main- 
tain day-to-day accuracy, the Monitor can reset the time base once each day 
from the line frequency clock in the front end processor (although a line
frequency clock has quite low resolution, it has very high long-term accuracy).

The interval counter is a 12-bit hardware counter that counts in 10 µs
increments (100 kHz). It can therefore count, and signal completion of, any
interval from 10 µs to 40.95 ms; and it can also be read at any time to
determine how long some particular operation or procedure has taken. The 
counter can be used for any purpose by the software, but it is employed 
principally to signal the Monitor should a user tie up the system too long. 
Associated with the counter are two flags, Interval Done and Interval Over- 
flow. Done sets when the counter reaches the value the program specifies as 
its period or reaches its maximum (all 1s); Overflow sets only if the counter
reaches its maximum without ever matching its period. 40 Setting Done requests
an interrupt on the level assigned to the counter, and the processor
responds by executing the instruction in location 514 of the executive proc- 
ess table. 
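
The arithmetic of the interval counter, and the difference between Done and Overflow, can be illustrated with a small Python model. The names are invented for the example, and the model assumes the counter counts upward one step per 10 µs from a given starting value; what the hardware does with the counter after Done sets is not modelled.

```python
TICK_US = 10                    # one count every 10 microseconds
MAX_COUNT = (1 << 12) - 1       # 12-bit counter, maximum 4095 (all 1s)


def period_for(interval_us):
    """Period value for an interval given in microseconds."""
    if interval_us % TICK_US or not TICK_US <= interval_us <= MAX_COUNT * TICK_US:
        raise ValueError("interval must be a multiple of 10 us up to 40950 us")
    return interval_us // TICK_US


def run_until_done(period, start=0):
    """Count up from start until Done sets.

    Returns (ticks elapsed, overflow). Done sets when the counter matches
    the period or reaches its maximum; Overflow accompanies Done only when
    the maximum is reached without a match.
    """
    if not 0 < period <= MAX_COUNT or not 0 <= start < MAX_COUNT:
        raise ValueError("period or start out of range")
    counter, ticks = start, 0
    while True:
        counter += 1
        ticks += 1
        if counter == period:
            return ticks, False
        if counter == MAX_COUNT:
            return ticks, True


print(run_until_done(period_for(1000)))   # a 1 ms interval
print(run_until_done(50, start=60))       # period already passed
```

The second call corresponds to the case in footnote 40: the period is below the current counter value, so the counter can only run to its maximum.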



CONO MTR, Conditions Out, Meters 



[Instruction format diagram: operation code 70260 in bits 0-12, I in bit 13, X in bits 14-17, Y in bits 18-35.]

Assign the interrupt level specified by bits 33-35 of the effective conditions 
E and perform the functions specified by bits 18-26 as shown. 

39 Unassigned instructions using these codes are DATAO TIM,, BLKO MTR, and DATAI
MTR,. They execute as MUUOs.

40 Overflow can occur only if at some time during the count, the program changes the period
to a value less than the current counter value.
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3.8 Error and Diagnostic Instructions 

The first part of this section explains the instructions through which the 
software handles the error flags and identifies the source of a hardware 
error. The second part discusses a special instruction the Monitor uses to
get diagnostic and configuration information directly from individual
memory controllers. The objective of this
treatment is to complete the definition of all KL10 instructions and to give 
the programmer what he needs to identify sources of hardware error for 
purposes of software recovery. For information on diagnosing equipment 
ills, the reader must turn to maintenance documents. Note that this section 
does not touch on diagnostic functions the front end can execute in the
KL10 without the KL10 microcode running; that subject is treated in the 
maintenance documentation. 

Error Monitoring and Investigation 

A few hardware errors — specifically a parity error in the page table or in a 
word brought into AR or ARX from memory — are detected by the pager 
and produce a page failure. Other hardware errors detected in the processor 
or on the S bus are indicated by flags that can request an interrupt on a 
level assigned to the processor. Several of these flags also lock information 
about the bad reference into the error address register ERA. The program 
can read this register, and it continues to hold the same information, even 
should subsequent errors occur, until the flag that locked it is cleared. 

The error conditions are generally regarded as important enough to be 
assigned the highest priority level. However for conditions that may be
associated with user instructions (a parity error or unanswered memory 
reference), the common practice is for the error interrupt to switch over to 
the lowest priority level by means of a program-set request. Then the time
taken to handle the situation, which may well be considerable, cannot interfere
with high priority events.

Error flags are handled by two condition IO instructions that address
the processor, which has device code 000, mnemonic APR. 44 These instruc- 
tions also handle the sweep flags for the cache (§3.2). The instruction that 
reads ERA uses the interrupt device code. 
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Assign the interrupt level specified by bits 33-35 of the effective conditions 
E and perform the functions specified by bits 19-31 as shown (a 1 in a bit
produces the indicated function, a 0 has no effect).



[Figure: CONO APR, conditions word. Bit 19 clears all in-out devices; bits 20-23 respectively enable, disable, clear and set the flags selected by bits 24-31 (S bus error, no memory, in-out page failure, MB parity, cache directory parity, address parity, power failure, sweep done).]

A 1 in bit 19 generates the IO reset signal, which clears the control
logic in all of the peripheral equipment (but affects none of the internal 
devices, such as the pager or the processor flags). 

Bits 20-23 select flag functions: 1s in these bits produce the indicated
effects on the processor flags selected by 1s in bits 24-31. A 1 in bit 20
enables the setting of any selected flag to request an interrupt on the level
assigned to the processor; a 1 in bit 21 disables the selected flags from
requesting interrupts. Similarly a 1 in bit 22 or 23 clears or sets the selected
flags. The result of putting 1s in both bits 20 and 21 or 22 and 23 is
indeterminate.

Note: setting flags has of course no relation to what the flags represent;
the function is used only to check out the flag logic.
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Read the status of the processor error and sweep flags into location E as 
shown (asterisks indicate bits that can cause interrupts). 



[Figure: processor status word format; the priority interrupt assignment is in bits 33-35.]



44 The processor device code is also used in several instructions for the pager and the cache.
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6-13 A 1 in any of these bits indicates that setting the listed flag will 
request an interrupt on the level assigned to the processor by bits 
33-35 of the CONO.

19 The cache is currently undergoing a sweep. 

24 A storage controller has signaled the processor that it has detected
an error in its own operation or in information it has received over 
the S bus or from one of its storage modules. If the type of error is 
not identified by there also being a 1 in bit 25, 27 or 29, then the
condition is either an incomplete cycle or a parity error in data sent 
to the memory (all data received by memory is written, even if 
bad). Controller flags for some of these conditions can be read by 
the diagnostic instruction discussed in the second part of this sec- 
tion. 

25 The processor attempted to access a memory that did not respond 
within a preset time. This time is 68 ms on an extended KL10, 82 µs
on a single-section KL10. The setting of this flag locks information 
about the attempted reference into ERA. Since a nonexistent mem- 
ory supplies zero data, on read this error should be accompanied by 
a 1 in bit 27. 

26 A page failure has occurred in an interrupt instruction, or a word 
with even parity has been received at AR from the E bus (the latter 
can be recognized only if the transmitting device generates a parity 
bit). An interrupt failure caused by an address break sets' this flag 
instead of producing an address failure (§3.5). 

NOTE 

A page failure in an interrupt instruction is regarded 
as a fatal error, and causes an interrupt instead of a 
page failure trap. The kernel program is expected to 
set up the interrupt instructions so that a software 
page failure simply cannot occur. 

27 The buffer (MB) in memory control has received a word with even 
parity. The setting of this flag locks information about the refer- 
ence into ERA. 

28 A physical page number with even parity has been encountered in 
the cache directory. The setting of this bit turns off the cache, and 
it remains off until the flag is cleared by giving a CONO APR, with 
1s in bits 22 and 28.
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29 A storage controller has signaled that it has received an address 
with even parity from the processor. The parity check actually en- 
compasses both the address and the control signals that accompany 
it on the S bus. The setting of this bit locks information about the 
attempted reference into ERA. 

30 Ac power has failed. The program should save PC, the flags, mode 
information and fast memory in storage, update the accounting 
meters, validate the entire cache, and halt the processor. Note that 
PC may point to an interrupt routine rather than the main pro- 
gram. After power is restored the front end must reboot the system, 
and the Monitor must reestablish the operating environment (§3.5). 

31 A cache sweep has been completed. 

32 Some processor flag is currently requesting an interrupt, i.e. some 
flag in bits 24-31 is set and has been enabled to interrupt as indi- 
cated by a 1 in the corresponding position in bits 6-13. 
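
The relationship between the function bits, the flag bits and the status word can be sketched in Python. The flag names and helper functions below are invented for the illustration; the bit numbers come from the descriptions above, including the correspondence of enable bits 6-13 in the status word to flag bits 24-31.

```python
FUNCTION_BITS = {"enable": 20, "disable": 21, "clear": 22, "set": 23}

FLAG_BITS = {
    "sbus_error": 24,
    "no_memory": 25,
    "io_page_failure": 26,
    "mb_parity": 27,
    "cache_dir_parity": 28,
    "address_parity": 29,
    "power_failure": 30,
    "sweep_done": 31,
}


def bit(n):
    """Mask for bit n of a 36-bit word, where bit 0 is the high order bit."""
    return 1 << (35 - n)


def cono_apr(functions, flags, level=0, io_reset=False):
    """Assemble the conditions for a CONO APR, (hypothetical helper)."""
    functions = set(functions)
    if {"enable", "disable"} <= functions or {"clear", "set"} <= functions:
        raise ValueError("conflicting functions give an unpredictable result")
    if not 0 <= level <= 7:
        raise ValueError("interrupt level occupies bits 33-35")
    word = level
    if io_reset:
        word |= bit(19)
    for name in functions:
        word |= bit(FUNCTION_BITS[name])
    for name in flags:
        word |= bit(FLAG_BITS[name])
    return word


def interrupting_flags(status):
    """Flags in a status word that are both set (bits 24-31) and enabled
    to interrupt (the corresponding positions in bits 6-13)."""
    return [name for name, n in FLAG_BITS.items()
            if status & bit(n) and status & bit(n - 18)]


# Clear the cache directory parity flag: 1s in bits 22 and 28.
print(f"{cono_apr({'clear'}, {'cache_dir_parity'}):o}")
```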
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Read Error Address Register 
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Read the contents of the error address register into location E. If No Mem- 
ory, MB Parity Error or Address Parity Error is set, ERA contains informa- 
tion about the reference corresponding to the first of those flags to be set as 
shown. 



[Figure: ERA word format. Word number in bits 0-1; reference identification in bits 2-6 (sweep, channel, data source, write); physical address of first word of transfer in bits 14-35.]



Bits 0-1 and 14-35 identify the physical location of the reference in which 
the error occurred. Bits 14-35 are the address of the specific memory refer- 
ence made by the program or whatever. If the reference required only a 
single transfer, that address is the error address. But if the reference 
triggered a group transfer, bits 14-35 are the address of the first reference 
chronologically in the group, and bits 0 and 1 give the number of the word
on which the error actually occurred. Note that word numbers are in physi- 
cal, not chronological, order. 

Information given in bits 2-6 identifies the reference. A 1 in bit 2 or 3 
respectively means the reference was made for a cache sweep or a channel 
transfer. Bit 6 indicates the memory function being performed for the reference,
where the read and write parts of a read-pause-write are separately
indicated by 0 and 1. Bits 4, 5 and 6 together identify the source of the data
for the transfer or attempted transfer (on write the word is always going to
storage).

Bits 4-5   Source with 0 in bit 6                      Source with 1 in bit 6

00         Storage for any read or read-pause-write    Channel status
01                                                     Channel data
10                                                     AR

refill



ERA retains the same information until the program clears the locking
flags by giving a CONO APR,2260P. Of course only flags that are set actu- 
ally need be cleared, and the routine that responds to errors should consider 
and clear all set flags. To facilitate diagnosis from the front end, the master 
reset does not clear ERA. Hence if need be, the front end can give diagnos- 
tic functions that reset the KL10 and then read ERA. 
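
The fields of the ERA word described above can be pulled apart as in the following Python sketch. The function name and the sample word are invented for the illustration; the sketch only separates the fields and makes no attempt to interpret the source code in bits 4-5.

```python
def decode_era(word):
    """Split a 36-bit ERA word into the fields named in the text.

    Bit 0 is the high order bit, so bit n is found at shift 35 - n.
    """
    return {
        "word_number": (word >> 34) & 0o3,       # bits 0-1
        "sweep": bool((word >> 33) & 1),          # bit 2
        "channel": bool((word >> 32) & 1),        # bit 3
        "source": (word >> 30) & 0o3,             # bits 4-5
        "write": bool((word >> 29) & 1),          # bit 6
        "address": word & ((1 << 22) - 1),        # bits 14-35
    }


# Made-up sample: word number 2, channel reference, write, address 1234567.
sample = (2 << 34) | (1 << 32) | (1 << 29) | 0o1234567
print(decode_era(sample))
```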

The processor includes provision for forcing bad parity to check the 
error detection logic. Bits 18-20 of a CONO PI, (§3.1) respectively cause 
even parity to be generated for an address sent to memory, a data word 
available from AR, and a page number entered into the cache directory. 
Where the data error shows up depends on where the word is sent from AR. 
Which errors are being forced can be seen by checking the flags in the same
bits of a CONI PI,.

Programming Cautions. When handling parity error or nonexistent 
memory interrupts, the programmer should beware of the following. 

• An incorrect word from memory to AR or ARX can result in both a page 
failure and an interrupt. In general the page fail trap to the Monitor can be 
expected to occur slightly ahead of the interrupt. 

• Should an error flag be set while another interrupt request is being 
processed, the system would handle the lower priority interrupt before get- 
ting to the processor interrupt. This means PC may be pointing to a lower 
level interrupt routine rather than the program level at which the error 
occurred. Remember that during request processing, the interrupt system 
is otherwise static and the program continues. 

• Even without inadvertent interference from another level, it is quite 
likely the processor will perform one or perhaps two more instructions be- 
tween the time the error flag sets and its interrupt starts. Hence even 
though PC is at the correct program level, it may well be pointing to the 
first or second instruction following the one in which the error occurred. 

• A processor error interrupt that switches over to a lower priority level 
should not return to the interrupted program, as the error may simply 
recur, producing a second processor interrupt before the error-handling
interrupt for the first. This could happen because PC is actually pointing to
the offending instruction, but beyond that, one error often begets another;
consider the case of PC counting into a nonexistent memory. In any
event, it is generally not worthwhile to return to any program without first 
finding out what went wrong. 

S Bus Diagnostic Cycle 

Ordinarily the S bus is used for the processor to reference memory. But the 
S bus also has a diagnostic cycle that allows the processor to communicate 
with the memory controllers rather than to access a particular location. 
The diagnostic cycle is initiated by the processor giving a special instruc- 
tion that sends a function word to a controller and receives a word of error 
and diagnostic information back from it. 



SBDIAG S Bus Diagnostic Function (BLKO PI,) 
Send the contents of location E as a function word over the S bus to the
controller specified by bits 0-4, and read the return word for the function
from that controller into location E+1. Which function a word represents is
indicated by its code in bits 31-35. 
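
The field layout of the function word can be illustrated with a minimal Python sketch. It assumes the usual PDP-10 convention that bit 0 is the most significant bit of the 36-bit word; the helper names field and make_function_word are illustrative only and do not come from the manual.

```python
WORD_BITS = 36  # a KL10 word is 36 bits; bit 0 is the most significant


def field(word, first_bit, last_bit):
    """Return bits first_bit..last_bit of a 36-bit word (bit 0 = MSB)."""
    width = last_bit - first_bit + 1
    shift = WORD_BITS - 1 - last_bit
    return (word >> shift) & ((1 << width) - 1)


def make_function_word(controller, function):
    """Put a controller number in bits 0-4 and a function code in bits 31-35."""
    return ((controller & 0o37) << (WORD_BITS - 5)) | (function & 0o37)


word = make_function_word(controller=0o04, function=0o01)
print(oct(word), field(word, 0, 4), field(word, 31, 35))
```

The sketch only models how the two fields share one word; it says nothing about what any particular function code does.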
CHAPTER 1
INTRODUCTION 



The DECsystem-10 scheduler provides several response levels for 
long-term, short-term, and high-priority computing needs. Also, 
numerous scheduling parameters and two different modes of operation 
provide flexibility in the scheduling policy. 



1.1 OPERATION MODES 

There are two ways to operate the scheduler: Round Robin [1] mode and
Class Scheduler mode. 

Round Robin mode means that each job in the long-term processor queue 
(called PQ2) receives an equal share of the resources of the system. 
In other words, each job receives the full attention of the system for 
a short interval called a time slice. When the time slice allotted to 
a job expires, the job goes to the back of the queue. Then, the full
attention of the system turns to the next job in the queue. 

Round Robin mode gives good turnaround time to small jobs even though 
there are large jobs in the system. In addition, it gives each job an 
equal chance to use the system resources. Each job receives its 'fair
share' of the system. Therefore, no job, regardless of its makeup,
can take over the system. 
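
The effect described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (no swapping, no I/O waits, one job at a time); the function name round_robin and the job names are hypothetical.

```python
from collections import deque


def round_robin(runtimes, time_slice):
    """Run jobs in Round Robin order and return the order in which they finish.

    runtimes maps a job name to the CPU time it still needs.
    """
    queue = deque(runtimes.items())
    finished = []
    while queue:
        name, remaining = queue.popleft()      # job at the front gets the CPU
        remaining -= time_slice                # it runs for one time slice
        if remaining > 0:
            queue.append((name, remaining))    # slice expired: back of the queue
        else:
            finished.append(name)              # job ended within its slice
    return finished


print(round_robin({"big": 10, "small": 2, "medium": 4}, time_slice=2))
# prints ['small', 'medium', 'big']
```

The small job finishes first even though it was queued behind the large one, which is the turnaround property claimed for Round Robin mode.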

Class Scheduler mode means that each job in PQ2 receives a share of 
the resources of the system. However, unlike Round Robin mode, each 
of these shares is not necessarily equal. Instead, each job in PQ2 is 
assigned to a class for which the system administrator sets a quota of 
system resources. The higher the quota, the more often the class is 
scanned for scheduling and swapping. In Class Scheduler mode, all 
jobs in PQ2 are also stored in a set of subqueues by class. As jobs 
expire their time slices, they go to the back of PQ2 and to the back 
of the subqueue for their class. This action gives Round Robin 
operation within the classes. Also, each class is swapped in and
scheduled depending on its class quota. 

The class quota consists of a primary percentage and a secondary 
allocation. The primary percentage is the amount of resources 
allotted to the class. The secondary allocation is the amount of 
leftover resources allotted to the class. Leftover resources occur 
when some of the classes do not use all of their primary percentages. 



[1] Kleinrock, L., "Timeshared Systems: A Theoretical Treatment,"
Journal of the ACM, Vol. 14, No. 2, 1967, pages 242-261. 
The system administrator may define any one of the classes as a
background batch class. Background batch jobs do not run unless there 
are no runnable jobs in any of the other classes. Although normally 
the background batch class has a zero primary percentage and a zero 
secondary allocation, this is not a restriction. In fact, the primary 
percentage and the secondary allocation have the same meaning for the 
background batch class as for any other class. A nonzero primary 
percentage forces background batch jobs to run a certain percentage of 
the time. In addition, a nonzero secondary allocation gives 
background batch jobs a proportion of the leftover time. 



1.2 OBJECTIVES

The overall design objectives of the scheduler are listed in the 
following. 

1. Provide for sharing computer time among jobs with long-term 
computing needs. 

2. Provide fast response time for interactive jobs. 

3. Provide very fast response time for real-time jobs. 

4. Provide for efficient use of all of the system resources.

Objective 1 above applies to jobs with long-term computing needs. For 
example, 

• Compilation of FORTRAN, COBOL, ALGOL, and BASIC programs

• Execution of mathematical and statistical programs

• Execution of programs for sorting, merging, and/or file
storage and retrieval

Objective 2 applies to interactive jobs that require fast response
time. For example,

• A user editing a file 

• A user updating a database 

In this case, each time the user ends a line sending his input to the
system, he expects to receive a response within a matter of seconds
(preferably 1 to 4 seconds).

If the scheduler must complete a full cycle through PQ2 before 
responding, it cannot reliably achieve this optimum 1- to 4-second 
response time. The time required to make a complete cycle through PQ2
can depend on the character of the jobs in the queue, and does depend 
on the number of jobs in the queue. On a heavily loaded system, the 
response time can easily exceed 10 seconds. Clearly, this response 
time is unacceptable to the interactive user. Therefore, the 
scheduler provides a priority processor queue called PQ1.

Normally, the scheduler selects jobs in PQ1 before it selects any jobs 
in PQ2. In this way, the scheduler can meet the goal of fast response
time without wasting CPU time and without allowing those jobs that do
not require interactive response to suffer.
Jobs enter PQ1 at the back of the queue and are assigned a time
interval to remain in the fast-response queue. If a job ends before
its time slice is exhausted, it leaves the processor queues. If a job
does not finish its task before its time slice is exhausted, it goes
to the back of PQ2. Thereafter, until the job ends, it receives the 
same attention as any PQ2 job. 
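
The movement of a job out of PQ1 can be sketched as follows. This is a hypothetical Python model, not monitor code: the queues are plain deques and the function name end_pq1_turn is invented for the illustration.

```python
from collections import deque


def end_pq1_turn(job, finished, pq1, pq2):
    """Handle a PQ1 job whose turn in the fast-response queue is over.

    finished is True if the job ended before its time slice was exhausted.
    """
    pq1.remove(job)          # either way, the job leaves PQ1
    if not finished:
        pq2.append(job)      # unfinished work continues at the back of PQ2


pq1 = deque(["edit", "compile"])
pq2 = deque(["sort"])
end_pq1_turn("compile", finished=False, pq1=pq1, pq2=pq2)
end_pq1_turn("edit", finished=True, pq1=pq1, pq2=pq2)
print(list(pq1), list(pq2))
# prints [] ['sort', 'compile']
```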

Typically, there are more jobs in PQ1 and PQ2 than can fit in memory
at any one time. Therefore, some of the jobs must be stored
temporarily on a high-speed swapping device, such as a disk or a drum.
As the jobs that are in memory are requeued, they become eligible to
be swapped out. Then, as space becomes available in memory, jobs are
swapped into memory by scanning the processor queues and swapping in
the highest priority job that is not already in memory. Jobs in PQ1
receive priority over jobs in PQ2. This is true both for swap-in and
for allocation of resources once they have been swapped in.

Objective 3 applies to jobs that require very fast response time and 
better performance than PQ1. For these jobs, the scheduler provides a
final set of processor queues called high-priority queues. There may 
be up to 15 high-priority queues, called HPQ1 through HPQ15. 

The kinds of jobs that would use the high-priority queues are, for 
example, 

• Card-reader and line-printer spoolers 

• Real-time data acquisition

These programs must be swapped to disk when the physical devices are 
not busy. This action provides more room for other terminal jobs. 
When there are cards to read or lines to print, these jobs must be
swapped in as soon as possible and remain in memory while in service. 
Also, these jobs must be able to get CPU attention instantaneously to 
fill and empty buffers of input and output. The scheduler achieves 
very fast swap-in and instantaneous access to the CPU by swapping in 
and scheduling resources (such as CPU, and so forth) for all HPQs
ahead of PQ1 and PQ2.

Jobs in high-priority queues can require any amount of system 
resources up to and including 100% of the system. Whatever resources 
remain unused are then available for jobs in PQ1 and PQ2. In the
example of the card-reader-stacker and the line-printer-spooler jobs, 
a certain amount of memory is dedicated to these jobs when they are 
active. Therefore, the amount of user memory area available to all 
other jobs is correspondingly reduced. 

As far as CPU time is concerned, jobs in high-priority queues are I/O 
bound and, therefore, use very little. Because of this, most of the 
CPU (over 95%) is available for other user jobs.

Objective 4 applies to all jobs. The scheduler runs the system as
efficiently as possible within the constraints imposed by the first 
three objectives, including 

• Balancing the percentage of CPU versus I/O jobs in core memory 
so that multiprogramming is most effective 

• Balancing the percentage of PQ1 versus PQ2 jobs in memory so
that a good compromise is achieved between throughput and 
short-term response 
CHAPTER 2
OVERVIEW OF SCHEDULER OPERATION 



All jobs in the system are maintained in a master set of queues. Each 
job is in one and only one of the queues. For convenience, the master 
set is divided into two logical groups: processor queues and
long-term wait queues.



2.1 PROCESSOR QUEUES 

The processor queues are the high-priority queues (HPQs), PQ1, and
PQ2. Each of these is described in the following.

HPQs (Up to 15 levels, called HPQ1 through HPQ15) contain jobs 
that require real-time response, such as the 
line-printer-spooler and the card-reader-stacker programs. 

PQ1 Contains jobs that require fast response, such as those
that conversationally interact with the user.

PQ2 Contains jobs that require long-term computing, such as
those that compile FORTRAN, COBOL, ALGOL, and BASIC
programs. For the class scheduler, all jobs in PQ2 are
also in the class subqueues.

Jobs in the processor queues either are ready to execute on the 
processor or are in various short-term wait states (such as waiting 
for disk I/O or I/O from other high-speed devices). These short-term
wait states are too small to warrant requeueing the job, because it 
would then lose its position in the processor queues and be marked for 
swap-out. A wait-state code indicates which jobs are runnable and 
which are waiting. Table 2-1 lists the wait-state codes. 
Table 2-1
Wait-State Codes

Code Meaning

IOW I/O wait for unit record, reader, printer, and so forth. 

DIOW Disk I/O wait (RP02, RP03, RP04, and so forth). 
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file structure. 

MQ Waiting for monitor buffer (to read file retrieval 
pointers, for example).

DA Waiting for system interlock to clear to access SAT 
table to get an allocation of disk blocks on the file 
system. 

CB Waiting for system interlock to clear to access core
block allocation routine (to get space for a DDB or file
access table from the core block pool, for example).

D1,D2 Waiting for DECtape controller.

DC Data controller wait. 

CA Core allocation (lock) wait. 

PIOW Paging I/O wait. 

PS Paging I/O satisfied. 

EV Exec virtual-memory wait.

NAP Short-term sleep. 



2.2 LONG-TERM WAIT QUEUES 

Table 2-2 lists the long-term wait queues. 
Table 2-2
Long-Term Wait Queues

Queue Meaning

CMQ Command Wait Queue. You have typed a monitor command that
cannot be executed until the job is in memory, and the job
is not in memory. This produces a higher priority
swap-in, then requeues to PQ1.

waiting for the device to print output already sent to it.
This includes pseudoteletypes.

SLPQ Sleep Queue. The job has executed the SLEEP monitor call
and requested to sleep for some interval, or it has
executed the HIBER monitor call and requested to sleep
until the WAKE monitor call is executed by another job, or
some specified condition has been satisfied.

JDCQ DAEMON Wait Queue. The job is waiting for service by
DAEMON (for example, to record accounting data or to
perform error logging).

STOPQ Stop Queue. The user has typed a CTRL/C, for example, to
stop his job.

NULQ Null Queue. All job slots must be accounted for in the
queue structure. This queue contains the numbers of the
job slots not currently in use (including jobs that have
CORE zeroed).

EWQ Event Wait Queue. Waiting for a magnetic tape controller,
for example.



Within priority wait queues, the jobs are ordered by priority.

The first job in the queue has the highest priority. In the long-term 
wait queues, the order of the jobs is immaterial. 

The master queues (including the subqueues) are each separated into 
two mutually exclusive lists: one for jobs that have core (JBTADR ≠
0) and one for jobs that do not have core. This significantly reduces
overhead, because various scans use only one set of queues. For 
example, the scheduling scans do not look at jobs that do not have 
core. 



2.3 SPECIAL QUEUES

A number of special queues are used to improve communication between 
the scheduler and the swapper, and to properly handle background batch 
jobs. Jobs in the special queues are also in the master queues. 
Table 2-3
Special Queues

Queue Meaning

JIL Queue of PQ2 jobs that have just been swapped in, that is,
those jobs that have not yet expired 1 time slice since
they were swapped in. The queue is divided into two
lists: one for timesharing jobs and one for background
batch jobs.

OLS Queue of PQ2 jobs that are eligible to be swapped out,
that is, those jobs that have expired at least 1 time
slice. The queue is divided into two chains: one for
timesharing jobs and one for background batch jobs.



2.4 TIME SLICE 

The time slice controls the movement of jobs within the processor 
queues. The time slice is defined as two separate parameters: 
quantum runtime and in-core protect time. Quantum runtime is 
decremented as the job uses the processor. In-core protect time is 
decremented whether or not the job uses the processor, as long as the 
job has been scanned by the scheduler. 

CPU-bound jobs generally expire quantum runtime. I/O-bound jobs 
generally expire in-core protect time. When either parameter expires, 
the job is considered to have ended its time slice. 
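
The two-parameter time slice can be modeled with a small Python sketch. It is a simplification under stated assumptions: one call to tick stands for one accounting step, the starting values are arbitrary, and the class and field names are hypothetical rather than monitor symbols.

```python
from dataclasses import dataclass


@dataclass
class TimeSlice:
    quantum_runtime: int   # accounting steps of processor use remaining
    in_core_protect: int   # accounting steps of protected residency remaining

    @property
    def expired(self):
        """The slice ends as soon as either parameter runs out."""
        return self.quantum_runtime <= 0 or self.in_core_protect <= 0

    def tick(self, ran, scanned):
        """Charge one accounting step and report whether the slice has ended."""
        if ran:                      # only charged while using the processor
            self.quantum_runtime -= 1
        if scanned:                  # charged whenever the scheduler scanned the job
            self.in_core_protect -= 1
        return self.expired


cpu_bound = TimeSlice(quantum_runtime=3, in_core_protect=10)
io_bound = TimeSlice(quantum_runtime=3, in_core_protect=2)
print([cpu_bound.tick(ran=True, scanned=True) for _ in range(3)])
print([io_bound.tick(ran=False, scanned=True) for _ in range(2)])
```

In this model the CPU-bound job ends its slice by exhausting quantum runtime, while the I/O-bound job ends it by exhausting in-core protect time with its quantum untouched.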

The time slice is assigned when a job is swapped in or when it 
initially begins to run. It is reassigned whenever a job is requeued 
to a new position in the processor queues. 

Within its time slice, a job may enter and leave various short-term 
wait states without being requeued to a new position in the queues. 
Requeues in and out of short-term wait involve only a change in the 
wait-state code; no queue transfer takes place. 

Jobs that block to any long-term wait state are physically requeued to 
one of the long-term wait queues. They lose their place in their
current processor queues and are not eligible to be swapped in or 
scheduled until they leave the long-term wait state. However, their 
positions in the long-term wait queues are immaterial. Jobs become 
runnable and leave the long-term wait queues according to their
individual job characteristics. Most jobs are requeued to the back of 
PQ1. 

Jobs in the processor queues that expire their time slices are 
requeued to the back of the processor queues. Primarily, the queue 
that the job is currently in determines its destination and the queue 
assignment of a new time slice. 



2.4.1 PQ2 Time Slice, Round Robin Mode 

In Round Robin mode, the PQ2 time slice gives each job in succession
an equal opportunity to use the system resources. 
PQ2 jobs are kept in two chains. Jobs that are in memory are in the
in-core chain. Jobs that have been swapped out are in the out-core 
chain. Both chains are ordered lists, with the highest priority jobs 
at the front of the chain. 

When the jobs are swapped in, they are assigned a time slice and are 
linked to the back of the in-core chain. As jobs are scheduled and 
expire their time slices, they are requeued to the back of the in-core 
chain. At this point, they become eligible to be swapped out. Jobs 
that have been swapped out go to the back of the out-core chain. Jobs 
that have been swapped in come from the front of the out-core chain. 
This action allows proper Round Robin cycling between the two chains.

Jobs that have not yet expired 1 time slice are kept in a special list
in the order in which they were swapped in. They are scheduled to run 
ahead of jobs waiting to be swapped out. This is consistent with the 
Round Robin algorithm, and provides the best short-term response time. 

Jobs are swapped out in the order in which they expire their first 
time slices. As they expire, they are placed in the swap-out list and
are removed from the just-swapped-in list.

While the jobs are waiting to be swapped out, they cycle around the 
in-core chain in Round Robin fashion. The jobs are assigned a time 
slice and, as they expire it, they are requeued to the back of the 
in-core chain. When there is no demand for swapping, core scheduling 
around the in-core chain results. 



2.4.2 PQ2 Time Slice, Class Scheduler Mode 

In the Class Scheduler mode, the jobs in PQ2 are given an opportunity 
to use system resources in proportion to the size of their class 
quotas. The time slice allows Round Robin cycling within a class. 

All jobs in PQ2 are also stored in a set of subqueues by class. The 
subqueues are ordered lists, with the jobs of the highest priority at 
the front of the subqueue. Like PQ2, the subqueues have in-core and
out-core chains. 

When jobs are swapped in, they are assigned a time slice. As jobs are 
scheduled and expire their time slices, they are requeued to the back 
of the PQ2 in-core chain, and to the back of the in-core chain of the 
subqueue for their class. They are then eligible to be swapped out. 

Jobs that have been swapped out go to the back of the PQ2 out-core 
chain and to the back of the out-core subqueue chain for their class. 
Jobs that have been swapped in come from the front of the subqueue 
out-core chain. This allows Round Robin cycling within the subqueues. 

The order in which the subqueues are scanned for swap-in depends on
the primary percentage and the secondary allocations defined by the
system administrator. The swapper operates with a 100-interval swap
cycle. At each interval, one of the classes (that is, subqueues) is
the first one scanned for swap-in. The number of times a class is
scanned first depends on the size of its primary percentage. That is,
a class with a primary percentage of 10% will be the first one scanned 
in 10 out of 100 intervals. 
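
One way to picture the 100-interval cycle is as a table with one entry per interval. The Python sketch below is an assumption-laden illustration: it places each class's intervals in a contiguous run and leaves any unallocated intervals as None, whereas the text does not say how the monitor orders or pads the entries. The name build_primary_table is hypothetical.

```python
SWAP_CYCLE = 100  # number of intervals in one swap cycle


def build_primary_table(primary_percent):
    """Return a 100-entry list naming the class scanned first in each interval.

    primary_percent maps a class number to its integer primary percentage.
    """
    if sum(primary_percent.values()) > SWAP_CYCLE:
        raise ValueError("primary percentages add up to more than 100")
    table = []
    for cls, percent in primary_percent.items():
        table.extend([cls] * percent)          # one entry per percentage point
    # Assumption: intervals not claimed by any class have no primary class.
    table.extend([None] * (SWAP_CYCLE - len(table)))
    return table


table = build_primary_table({0: 10, 1: 60, 2: 30})
print(len(table), table.count(0), table.count(1), table.count(2))
# prints 100 10 60 30
```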
If no jobs are eligible to be swapped in from the primary class, the
swapper selects one from a secondary class. The choice of secondary 
class depends on the size of the class's secondary allocations. The 
larger a class's secondary allocation, the higher its probability of 
selection. If the secondary class selected also has no jobs, another 
selection is made from the remaining secondary classes. If no jobs 
are found in the primary and secondary classes, the swapper considers 
a background batch job scan. 
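
The fallback to secondary classes can be sketched as a weighted draw without replacement. This Python fragment is only a model: the text says selection probability grows with the secondary allocation but does not give the monitor's actual method, so the use of random.choices, the exclusion of zero-allocation classes, and the name secondary_order are all assumptions.

```python
import random


def secondary_order(secondary_alloc, rng=None):
    """Return the order in which secondary classes would be tried.

    secondary_alloc maps a class number to its secondary allocation.
    Classes with a zero allocation are never selected in this sketch.
    """
    rng = rng or random.Random()
    remaining = {cls: w for cls, w in secondary_alloc.items() if w > 0}
    order = []
    while remaining:
        classes = list(remaining)
        weights = [remaining[cls] for cls in classes]
        pick = rng.choices(classes, weights=weights, k=1)[0]
        order.append(pick)      # try this class next
        del remaining[pick]     # if it has no jobs, choose among the rest
    return order


print(secondary_order({0: 50, 1: 30, 2: 0, 3: 20}, random.Random(1)))
```

Each class appears at most once in the returned order, so a caller can walk the list until it finds a class with an eligible job and fall through to the background batch scan if none has one.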

Background batch jobs can only be swapped in at a certain rate to 
prevent thrashing. The system administrator specifies this rate 
through the SCDSET program. (See Section 6.5.) 

If jobs exist in sufficient numbers in all classes, the swapping 
algorithm fills memory with jobs in proportion to their primary 
percentages. This allows the scheduler to schedule accurately while 
still achieving good short-term response times. 

Jobs that have not yet expired 1 time slice are kept in a special 
queue in the order in which they were swapped in. To guarantee a
minimum level of short-term response, the list of jobs just swapped in 
must be scanned for scheduling a certain percentage of the time. The 
response fairness factor determines the amount of time that the list 
of jobs just swapped in is scanned. The system administrator sets the 
response fairness factor with the SCDSET program.

The class scheduling scan is made up of 100 intervals. The
microscheduling parameters define the length of these intervals. The
system administrator sets the microscheduling parameters with the
SCDSET program.

Each time that the microscheduling interval expires, the scheduler 
moves to the next class in the primary scan table. The table contains 
100 entries, each representing the primary class for that interval.
The scheduler builds a complete subqueue scan table with all classes
by starting with the primary class and selecting the second, third,
..., nth class, depending on the size of the secondary allocations.
The scan table determines the order in which the subqueues are scanned 
throughout the current microscheduling interval. 

Jobs are swapped out in the order in which they expire their first 
time slice. As they expire, they are placed in the swap-out list and
are removed from the just-swapped-in list. 

While waiting to be swapped out, the jobs cycle around the in-core
chain for their subqueue in a Round Robin fashion. They are assigned 
a time slice and, as they expire it, they are requeued to the back of 
their subqueue in-core chain. When there is no demand for swapping, 
this results in class core scheduling. 



2.4.3 PQ1 Time Slice 

PQ1 jobs that expire their time slices are requeued to the back of 
PQ2. They are reassigned the normal PQ2 quantum runtime and they
retain whatever in-core protect time they have remaining. They are
not marked to be swapped out. This allows jobs in PQ1 to have very
good response for a short period of time. Thereafter, if they
continue to run, they may remain in core at least as long as a PQ2
job. 
PQ1 jobs are ahead of PQ2 jobs in the normal swap-in and scheduling
scans. To prevent PQ1 jobs from totally taking over the system, there 
is a set of swapping and scheduling fairness counts. This means that 
when a PQ1 job has been selected a certain number of times in a row, 
the fairness counts force PQ2 to be scanned first. 
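
A fairness count of this kind can be sketched as a streak counter. The Python class below is a hypothetical model: the limit value, the reset rule, and the names FairnessCount, scan_order, and record are assumptions made for the illustration.

```python
class FairnessCount:
    """Force PQ2 to be scanned first after too many PQ1 selections in a row."""

    def __init__(self, limit):
        self.limit = limit    # PQ1 selections allowed in a row
        self.streak = 0       # consecutive PQ1 selections so far

    def scan_order(self):
        """Return the order in which the two queues should be scanned now."""
        if self.streak >= self.limit:
            return ["PQ2", "PQ1"]
        return ["PQ1", "PQ2"]

    def record(self, queue):
        """Note which queue the selected job came from."""
        self.streak = self.streak + 1 if queue == "PQ1" else 0


fair = FairnessCount(limit=3)
for _ in range(3):
    fair.record("PQ1")
print(fair.scan_order())   # prints ['PQ2', 'PQ1']
fair.record("PQ2")
print(fair.scan_order())   # prints ['PQ1', 'PQ2']
```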



2.4.4 HPQ Time Slice 

HPQ jobs that expire their time slices are requeued to the back of the 
corresponding HPQ. If they expire their quantum runtimes, they are 
assigned new quantum runtimes and retain whatever in-core protect 
times they have remaining. If they expire their in-core protect 
times, they are assigned new quantum runtimes and in-core protect 
times. They are then eligible to be swapped out. 

The HPQ time slice defines how quickly the system can switch from one 
HPQ job to another. HPQ quantum runtime is, therefore, a very small 
number of ticks. HPQ in-core protect time is not very meaningful 
because only another HPQ job can force an HPQ job to be swapped out. 
It is unlikely that any installation would have more HPQ jobs in 
execution at once than could fit in memory. 



2.5 SCHEDULING SCAN AND ASSIGNMENT OF SHARABLE RESOURCES 

The scheduling scan searches the processor queues (in the order of
priority) for a job to run. Then, it selects the first runnable job
it finds in the scan. Jobs with a zero short-term wait-state code are
runnable. So are jobs waiting for sharable resources, if the
resources are currently available. To assign a resource, the
scheduler clears the job's short-term wait-state code and marks the
resource in use. This procedure causes sharable resources to be
assigned to the job with the highest priority.
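
The scan just described can be sketched in Python. The job records, the use of wait-state mnemonics as resource names, and the function name scheduling_scan are assumptions for the illustration; the real scan works on the monitor's own queue structures.

```python
def scheduling_scan(queues, free_resources):
    """Return the first runnable job, scanning queues in priority order.

    queues is a list of job lists, highest priority first. Each job is a
    dict with a "wait" entry: 0 if runnable, otherwise a wait-state code.
    free_resources is the set of sharable resources currently available.
    """
    for queue in queues:
        for job in queue:
            code = job["wait"]
            if code == 0:
                return job                       # already runnable
            if code in free_resources:
                free_resources.discard(code)     # mark the resource in use
                job["wait"] = 0                  # clear the wait-state code
                return job
    return None                                  # nothing runnable: null job


hpq = [{"name": "spool", "wait": "DIOW"}]
pq1 = [{"name": "edit", "wait": "MQ"}]
pq2 = [{"name": "sort", "wait": 0}]
free = {"MQ"}
print(scheduling_scan([hpq, pq1, pq2], free)["name"], free)
# prints edit set()
```

Because the scan walks the queues from highest to lowest priority, the first waiting job it reaches is the one that receives a free resource.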

If a high-priority job needs a resource held by a low-priority job,
the scheduler will attempt to run the lower priority job until it
gives up the resource. This feature is especially important in the
class scheduler.

Refer to Chapter 3 for a detailed description of the scheduler. 
This chapter describes the scheduler at the level of the macro code.
If you require only general knowledge of the scheduler, it is not 
necessary that you read this chapter. The labels referenced in this 
chapter are in the scheduler monitor module, SCHED1. 

This chapter discusses the following issues. 

• Jobs that perform GETSEGs release their high segments with 
their low segments still in memory. The swap-in scan must 
search the in-core chains to link to the new high segments. 

• The class scheduler can swap and schedule fixed classes by 
having a zero secondary allocation. The class scheduler can 
also perform fixed swapping with nonfixed scheduling. 

• Background batch imposes some complexities on the swap-in,
swap-out, and scheduling scans.

• The swap-out scan selects jobs in the long-term wait queues 
ahead of jobs in the processor queues. 

• The scheduling scan has a number of fairness counts that 
control the way in which the master queues and special queues 
are scanned. 



3.1 SCHEDULER ASSEMBLY 

The system administrator assembles the scheduler in one of two modes, 
depending on the value of the assembly switch FTNSCHED. When the 
system administrator sets FTNSCHED to 0, the Round Robin mode 
scheduler is assembled. 

In Round Robin mode, there are no scheduler classes and there is no 
SCHED. monitor call. When the system administrator sets FTNSCHED to 
-1, the Class Scheduler mode scheduler is assembled, which includes 
the code for the SCHED. monitor call.



3.2 CALLING THE SCHEDULER 

The scheduler is called into action when one of the following occurs: 

1. The clock ticks (an interval of 1/60th of a second has
elapsed).
2. The current job becomes unrunnable for any reason (for
example, long-term wait, short-term wait, or error).

3. The null job is running and some job becomes runnable (for 
example, finished with disk I/O). 

4. An HPQ job of higher priority than the current job becomes 
runnable. 

5. A job that has been chosen to be swapped out has just 
released all disk-sharable resources. 

The entry points for the scheduler are NXTJOB for CPU0, and NXTJB1 for
CPU1.



3.3 NXTJOB TO NXTJBX SECTION 

This section of code decrements the in-core protect times and requeues 
jobs when their in-core protect times expire. 

In-core protect times are maintained only when there are enough 
runnable jobs to require some of them to be swapped out. Whenever a 
specified period of time (SCDCOR) elapses during which no runnable 
jobs are swapped out, the scheduler assumes that core is not scarce
and stops making decisions based on core use. 



In-core protect time is stored in the PDB word labeled .PDIPT. (See
Figure 3-1.)
Figure 3-1 In-core Protect Time (.PDIPT)

PDMSWP is the sign bit of the same word, which may be defined as

PDMSWP = 0    The job may not be swapped, or

PDMSWP = 1    The job may be swapped.



In-core protect times are decremented every other clock tick unless no 
runnable jobs are being swapped out. This is done only on the odd 
ticks to save overhead. When core is not scarce, in-core protect 
times are not decremented at all (again, to save overhead). The
core-is-scarce timer (CORSCD) is decremented at this time. The
parameters for assigning in-core protect time and CORSCD are scaled in 
units of 2 ticks. 

Jobs in the processor queues are not decremented unless they have been 
scanned by the scheduler. This prevents jobs from being swapped in 
and then swapped out again without having a chance to run. This would 
occur if a job were swapped in when one or two heavy CPU-bound jobs of 
higher priority were already in core. In this case, although the 
newly swapped job would be in core, it would not get a chance to run 
until these jobs had completed their time slices. 
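
The decrement rules above can be summarized in a short Python sketch. It is a model under assumptions, not a transcription of the NXTJOB code: the job records, the field names, and the function name decrement_protect_times are hypothetical, and the requeue that follows expiration is omitted.

```python
def decrement_protect_times(tick, jobs, core_is_scarce):
    """Decrement in-core protect times and return jobs that became swappable.

    Work is done only on odd ticks and only while core is scarce. A job is
    charged only if the scheduler has scanned it since the last decrement.
    """
    if tick % 2 == 0 or not core_is_scarce:
        return []
    now_swappable = []
    for job in jobs:
        if not job["scanned"]:
            continue                    # never had a chance to run: not charged
        job["scanned"] = False          # clear the scanned-by-scheduler bit
        job["protect"] -= 1
        if job["protect"] <= 0:
            job["swappable"] = True     # set the job-is-swappable bit
            now_swappable.append(job["name"])
    return now_swappable


jobs = [
    {"name": "a", "protect": 1, "scanned": True, "swappable": False},
    {"name": "b", "protect": 1, "scanned": False, "swappable": False},
]
print(decrement_protect_times(2, jobs, core_is_scarce=True))   # prints []
print(decrement_protect_times(3, jobs, core_is_scarce=True))   # prints ['a']
```

Job b keeps its full protect time because the scheduler has not yet scanned it, which is the protection against being swapped out without a chance to run.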
The table DCSCAN defines the set of queues to be scanned. This table
contains an entry for each queue that is allowed to retain its in-core
protect time. These queues are listed below.



EWQ

SLPQ

PQ2

PQ1

HPQs



The queues are scanned from the back so that jobs being requeued 
cannot be requeued twice. PQ2 is scanned ahead of PQ1 for the same
reason. 

At NXTJOB, if the clock has not ticked, go to NXTJB1. On the even
tick, go to NXTJBX. On the odd tick, set up to scan the queues to be
decremented. On every clock tick, decrement CORSCD, which is the
in-core protect time.

At NXTJBL, if there are no more queues to be scanned, go to NXTJBX.
Otherwise, if there are no jobs in the next queue to be scanned, go to
NXTJBG.



At NXTJBA, remember the successor to the job being scanned in case of
requeue.



If the job being scanned is in a processor queue but has not already 
been scanned, go to NXTJBF. Otherwise, clear the scanned-by-scheduler 
bit (JS.SCN) and decrement the in-core protect time CORSCD. 

If the in-core protect time is no longer positive, set the 
job-is-swappable bit (PDMSWP). If the job is in command wait (CMWB=1)
or is waiting for requeue (JRQ=1), go to NXTJBD. Otherwise, assign a
new in-core protect time so the job will cycle. Then, if the job is
in a processor queue, requeue it with subroutine QXFER using transfer
table QTIME. Finally, go to NXTJBF.



At NXTJBD, the in-core protect time is set to zero. At NXTJBE,
deposit a new in-core protect time.



At NXTJBF, pick up the remembered link to the next job. If the link 
is a job, go to NXTJBA. If the link is a queue header, go to NXTJBL
and scan the next queue. 



At NXTJBX, in Class Scheduler mode, execute the subroutine SCDQTA to
check for the end of the microscheduling interval.



3.4 NXTJB1 TO CKJBOA SECTION

This section of code determines whether or not the current job has 
become unrunnable and checks for the end of the time slice when the 
quantum runtime has expired. 

If the current job is the null job, exit to CKJB1 to requeue all jobs 
with JRQ set. If the current job has executed an HPQ monitor call, if
it is waiting for DAEMON (JDC=1 or JS.DEP=1), or if it has been
requeued out of the processor queues, exit to CKJBOA to requeue the
current job. 

If the current job is runnable, check the in-core protect and quantum
runtime for expiration and requeue the job for time-slice expiration
(subroutine QARNDT), if required. Then, go to CKJB1.
If the current job is unrunnable, go to CKJBO to determine if the
current job needs requeuing. Unrunnable is defined as any of the bit 
settings (in JBTSTS) listed in Table 3-1. 

Table 3-1 
Unrunnable Bit Settings 

Bit Setting Meaning

Job does not want to run 

Job number not assigned 

Job expanding 

Monitor detected error 

Waiting to shuffle or shuffling 

Waiting to swap or swapped out 

Job in wait state 

Requeue requested 

At CKJB0, mask out JXPN, SHF, and SWP. If the job does not need
requeuing, exit to CKJB1. Otherwise, jump to CKJB0A.



3.5 CKJB0A TO CKJB5 SECTION

This section of code requeues the current job and/or all jobs in the 
system with JRQ equal to 1. 

CKJB0A is entered if the current job needs requeuing. If the requeue
bit is not set for the current job (JRQ = 0), go to CKJB3.

At CKJB1, if entered by the slave processor, go to CKJB5. Otherwise,
requeue all jobs in the requeue chain. The requeue chain is a
last-in-first-out linked list (JBTJRQ). The zero word is the header.
Each entry contains the job number of the next job in the list. A
zero entry indicates the end of the list. Note that the entries are
deleted from the list before JRQ is cleared, which prevents entering
the same job in the list twice.
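
The requeue chain described above can be modelled in a few lines. This is only an illustrative sketch: the class name, the per-job JRQ flag list, and the guard in request() are assumptions made for the example, not the monitor's code.

```python
# Hypothetical model of the requeue chain (JBTJRQ). All names are illustrative.
class RequeueChain:
    def __init__(self, max_jobs):
        # link[0] is the header; link[j] holds the job number that follows job j.
        # A zero entry marks the end of the list.
        self.link = [0] * (max_jobs + 1)
        self.jrq = [False] * (max_jobs + 1)  # stand-in for the per-job JRQ bit

    def request(self, job):
        """Set JRQ and push the job on the front of the chain (last in, first out)."""
        if self.jrq[job]:
            return  # assumed guard: a job with JRQ set is already chained
        self.jrq[job] = True
        self.link[job] = self.link[0]
        self.link[0] = job

    def drain(self, requeue):
        """Requeue every chained job, unlinking each entry before clearing JRQ."""
        while self.link[0] != 0:
            job = self.link[0]
            self.link[0] = self.link[job]  # delete the entry from the list first
            self.link[job] = 0
            self.jrq[job] = False          # only then clear JRQ
            requeue(job)
```

Draining visits jobs in the reverse of the order in which they were chained, which is the last-in-first-out behaviour the text describes.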

At CKJB3, the actual requeue is done by subroutine QREQ. Loop back to
CKJB1 to requeue any remaining jobs.



3.6 CKJB5 TO CKJB7 SECTION 

This section of code checks to see if exec virtual memory has become 
available (EVAVAL ? -1) , and if so, clears the wait-state code for any 
jobs waiting for it. The EV resource is required for all I/O devices 
that do not have data channels and, therefore, require that the 
monitor service routines be able to address user core with the EXEC 
page map. The resource being allocated is the EXEC page map slots.

At CKJB5, if entered from the slave, go to SCHED.



3.7 CKJB7 TO SCHED SECTION

This section of code determines whether or not SWAP/LOCK is called.

SWAP/LOCK is called only on CPU0, and only if one or more of the
following conditions are true:

1. An HPQ job on disk became runnable. 

2. The current job is the null job. 



3.8 SCHED TO QREQ SECTION 

This section of code selects the next job to run. It assigns sharable 
resources as required, and unwinds them from other jobs if necessary.

At SCHED, clear the potentially lost time flag and go to the 
processor-dependent scheduler. 

For a single processor, go to SCHEDJ. 

For a dual-processor entered by the master, go to MSCHED in CP1SER.
For a dual-processor entered by the slave, go to SSCHED in CP1SER.
MSCHED selects a job from the slave waiting for a monitor call.
Alternatively, if there are no such jobs or if the fairness count
(.COOFC) is greater than OFCQ, it selects a job by the normal scan at
SCHEDJ. SSCHED is a call to SCHEDJ.

At SCHEDJ, determine which scheduling scan table is to be used. If
the scheduling scan did not reach the last queue in the scan recently
enough (.CPSFC greater than or equal to MFC), use the secondary scan
table rather than the primary scan table. Table 3-2 contains the
primary and secondary scan tables (in-core chains only).



Table 3-2
Primary and Secondary Scan Tables

Primary Scan Table (SSCAN)              Secondary Scan Table (SSCAN1)

Queue   Routine                         Queue   Routine

HPQs    IQFOR                           HPQs    IQFOR
PQ1     IQFOR                           PQ2     IRRFOR (Round Robin mode)
PQ2     IRRFOR (Round Robin mode)       PQ2     ISSFOR (Class Scheduler mode)
PQ2     ISSFOR (Class Scheduler mode)   PQ1     IQFOR
PQ2     IBBFOR (Class Scheduler mode)   PQ2     IBBFOR (Class Scheduler mode)



For the slave processor, the primary and secondary scan tables are 
interchanged. 

If the swapper has selected a job to force out that has a
disk-sharable resource (FORCEF not equal to 0), try to run that job
until it gives up all resources, regardless of the job's actual queue
position. (The disk-sharable resources are: AU, CB, DA, and
MQ.) Otherwise, at SCHDJ1 call QSCAN to scan the processor queues in
the order specified by the previously selected scan table. Jobs
returned by the scan are processed by the following code from SCHEDB
to SCHD1.

At SCHEDB, call DXRUN to see if a job is runnable on the calling CPU.
If not, loop to scan for the next job (JRST (T2)).

Set the scan bit (JS.SCN) to 1. If the job has a zero wait-state code
(meaning RUN), go to SCHEDC. The code from this point to SCHEDC
assigns and unwinds sharable resources. (They are also assigned in
CLOCK1 if available the instant a job asks for them.) The sharable
resources involved are: AU, MQ, DA, CB, D1, D2, DC, and CA.

Sharable resources are not assigned to jobs that are swapped out
(SWP=1), shuffling (SHF=1), expanding (JXPN=1), or need to be requeued
(JRQ=1).

If the job needs a resource (identified by a wait-state code) and it 
is available (AVTBMQ?0) , go to SCHEDA and assign the resource. If the 
resource is not available, try to unwind it. 

The code from UNWND1 to SCHEDA unwinds sharable resources from lower
priority jobs so that they are available to higher priority jobs. The
unwind process is to look for a job that has the resource that is
desired and, if runnable, to run the job until it gives up the
resource.

If the job holding the desired resource is not runnable because it 
also is waiting for a resource then: 

1. If that resource is available, assign it and run that job. 

2. If that resource is not available, look for the job that has
   that resource and repeat the unwind process (repeating to a
   maximum depth of 10, with an expected maximum of 3).
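
The unwind steps above can be sketched as a bounded walk along the chain of resource holders. The data structures here (a holder map, a waiting-for map, and a set of available resources) are assumptions chosen for illustration; the monitor keeps this state in its own tables.

```python
MAX_DEPTH = 10  # maximum unwind depth stated in the text


def unwind(resource, holder, waiting_for, available, max_depth=MAX_DEPTH):
    """Return the job to run so that `resource` is eventually freed, or None.

    holder:      dict mapping resource -> job currently holding it (hypothetical)
    waiting_for: dict mapping job -> resource it waits for, or None if runnable
    available:   set of resources that are free right now
    """
    for _ in range(max_depth):
        job = holder.get(resource)
        if job is None:
            return None              # nobody holds it; nothing to unwind
        wanted = waiting_for.get(job)
        if wanted is None:
            return job               # holder is runnable: run it until it lets go
        if wanted in available:
            available.discard(wanted)  # step 1: assign the resource and run the job
            holder[wanted] = job
            waiting_for[job] = None
            return job
        resource = wanted            # step 2: chase the holder of that resource
    return None                      # gave up after max_depth links
```

The depth limit is what keeps a circular wait from looping forever in this sketch.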

MQ represents a special problem because there is usually more than one
monitor buffer. This means that there is more than one path to
success. The routine investigates all paths and chooses the shortest.
A path to success always ends with a job that is runnable. By running
that job until it gives up all resources and by repeating the process
for each job in the path, the original objective of freeing a given
resource for a higher priority job is eventually achieved.

At SCHEDA, a job being forced out that had a disk-sharable resource 
(FORCEF=J) will not be given any new sharable resources after it has 
given up the resources that prevented it from being swapped (assessed 
by calling FLSDR). Otherwise, if you know that the job is runnable,
go to SCHEDE and assign the resource.

At SCHEDE, assign the resource by clearing the wait-state code for the 
job. Also, do bookkeeping on AVTBMQ and USTBMQ.

At SCHEDC, the job being scanned is checked to be sure it is runnable
(normal definition). In addition, if the job scanned is being forced
out with a disk-sharable resource, it is considered unrunnable if it
has given up the resource. If it has not given up the resource, the
JXPN bit is ignored (as far as the job being runnable) because some
other job may have expanded a high segment being shared.

If the job is runnable, it is selected to run. The scheduling 
fairness counts are updated depending on the queue the job is in. The 
selected job number is in AC J. The scheduler exits to CLOCK1.

chains of the processor queues until the lost-time flag is set (.CPPLT
not equal to 0) or there are no more jobs to scan. This allows
computation of lost time, using a small amount of processor time that
otherwise would not be used. Set J to NULJOB and exit to CLOCK1.



3.9 QREQ TO QCHNG SECTION 

This section of code determines if a requeue requires a physical queue
transfer, and if so, sets up the right half of AC U to the desired
transfer table address (either directly or by indexing the QBITS table
with the wait-state code). If no physical queue transfer is required,
it performs the necessary bookkeeping for the requeue.

At QREQ, if CMWB, JDC, and JS.DEP are not all zero, go to QREQ1.
Otherwise, if the run bit is off at QREQ0, go to QSTOPT. If none of
the above special cases apply, call MSQRT to maintain dual-processor
monitor call counts. Dispatch to one of eight different transfer
routines using the left half of the QBITS table indexed by the
wait-state code.

At QREQ1, if the command wait bit (CMWB) is set to 1 and the job is
swapped out (SWP=1) or expanding (JXPN=1), set AC U to transfer table
QCMW, and go to QXFER.

At QREQ2, if the job is requesting service from DAEMON (JDC=1 or
JS.DEP=1) and the job does not have a disk-sharable resource, set AC U
to the state code JDCQ and go to QJDCT. Otherwise, go to QREQ0.

The QBITS table has one entry for each wait-state code. (See Table
3-3.) The left half of each entry is the address of the transfer
routine. The right half contains either the address of a transfer
table (to be used by QXFER) or -1 if no physical queue transfer is
required.
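
As a rough illustration of the two-halved entries, the lookup can be pictured as a table keyed by wait-state code. The sketch below uses a Python dictionary and a handful of rows taken from Table 3-3; the dictionary form, the function name, and the use of strings for addresses are assumptions made only for the example.

```python
NO_TRANSFER = -1  # right-half value meaning "no physical queue transfer"

# A few entries from Table 3-3: code -> (left half, right half).
QBITS = {
    "RN": ("QRNT", "QRNW"),
    "WS": ("QWST", NO_TRANSFER),
    "TS": ("QTST", "QTSW"),
    "SLP": ("QSLPT", "QSLPW"),
    "NAP": ("QNAPT", NO_TRANSFER),
}


def lookup(wait_state_code):
    """Return (transfer routine, transfer table or None) for a wait-state code."""
    routine, table = QBITS[wait_state_code]
    return routine, (None if table == NO_TRANSFER else table)
```

For example, lookup("WS") yields the routine QWST with no transfer table, while lookup("RN") yields QRNT together with the transfer table QRNW.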

The content of QBITS depends on the value of the assembly parameters.
Table 3-3 is a typical configuration.

Table 3-3
QBITS TABLE

Code      Left Half     Right Half

RN        QRNT          QRNW
WS        QWST          -1
TS        QTST          QTSW
DS        QDST          -1
PS        QPST          -1
AU        QAUT          -1
MQ        QMQT          -1
DA        QDAT          -1
CB        QCBT          -1
D1        QD1T          -1
D2        QD2T          -1
DC        QDCT          -1
EV        QEVT          -1
IOW       QIOWT         -1
TIOW      QTIOWT        QTIOWW
DIOW      QDIOWT        -1
PIOW      QPIOWT        -1
SLP       QSLPT         QSLPW
EW        QEWT          QEWW
NAP       QNAPT         -1
NUL       QNULT         QNULW
JDC       QJDCT         QJDCW
STOP      QSTOPT        QSTOPW



The following six routines perform bookkeeping and, where required, 
set up the right half of AC U to the desired transfer table address. 

QRNT:     Entry for jobs with zero wait state = runtime.

          If the job is in PQ2 and is being requeued because it
          is changing subqueues (JS.CSQ = 1), go to QREQX.

          Otherwise, go to QREQ3.

QPST:     Entry for paging satisfied.

QWST:     Entry for I/O wait satisfied.

QDST:     Entry for disk I/O wait satisfied.

          This routine checks to see if the job is in a processor
          queue, and if not, requeues it into PQ1 (QCHNG). It
          then clears the wait-state code and goes to QREQX.

QTST:     Entry for teletype I/O wait satisfied.

          This routine clears the wait-state code and goes to
          QREQ3.

QSLPT:    Entry for jobs entering sleep.

QEWT:     Entry for jobs entering event wait.

QREQ3:    Common entry, various requeue procedures.

          This routine sets up AC U from the right half of QBITS
          (transfer table address) and calls the queue transfer
          routine (QXFER). It then goes to QREQX.

QSTOPT:   Entry for jobs that do not have run bit set.

          This routine sets U to STOPQ unless the wait-state code
          (in U) is NULQ. Go to QREQZ.

QNULT:    Entry for jobs going to NULQ.

QTIOWT:   Entry for jobs going to TTY wait queue.

QREQZ:    Common entry point.

          This routine sets PDMSWP, indicating that the job may
          be swapped.

          Go to QREQ3 (to finish requeue).

QREQ6:    Common entry point, not labeled as such but includes
          NAP, all sharable resource waits, and all I/O waits
          except TTY.

          This subroutine checks to see if the job is in a
          processor queue, and if not requeues it to PQ1 (QCHNG),
          then goes to QREQX.

QREQX:    Exit for all requeue subroutines.

          If the job being requeued is changing subqueues
          (JS.CSQ=1) and is still in PQ2, requeue it to the back
          of the appropriate subqueue with subroutine TOBACK.

          Exit from QREQ.



3.10 SUBROUTINE QCHNG 

Requeue a job to the back of PQ1. This subroutine is used to transfer 
a job into the processor queues if it is in some other queue (STOPQ, 
for example) . It is required because the requeue logic for short-term 
wait states assumes that the job is already in the processor queues; 
however, in a few cases it is not. For example, a user types CTRL/C 
while in I/O wait and then later continues the job. 



3.11 SUBROUTINE SETIPT 

The SETIPT subroutine sets in-core protect time for the job to the
minimum value, which is used after the expiration of the first time
slice. (This may be different from values assigned at swap-in because
initial quanta reflect the difficulty of swapping in large jobs. In
this case, no swap-in has occurred.)

The value for minimum in-core protect time is installation dependent.
A reasonable range is from 0.5 second to 5 seconds.



3.12 SUBROUTINES ZERIPT, CLRIPT, CLRIP1

Set PDMSWP to indicate that the job is eligible for swap-out. 

3.13 SUBROUTINE ASICPT 

Compute and store the in-core protect time based on the size of the 
job. 

No in-core protect time is assigned if CORSCD is less than zero. In 
this case, it is assumed to be unnecessary because there is probably 
sufficient core for all running jobs. 

3.14 SUBROUTINE TOBACK

Requeue jobs to the back of PQ2 and the subqueue. 

3.15 SUBROUTINE QARNDT 

Requeue the job because the time slice has expired. 

If the job is currently in PQ1, requeue it to the back of PQ2, and
then assign a new quantum runtime. 

If the job is currently in PQ2, requeue it to the back of PQ2, and 
then assign a new quantum runtime and in-core protect time. Finally, 
mark the job eligible for swap-out. 

If the job is in HPQ, requeue it to the back of the same HPQ, and then 
assign a new quantum runtime. 
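
The three cases above reduce to a small decision table. The sketch below is only a restatement of the prose; the tuple layout, the function name, and the use of queue names as strings (with any name beginning "HPQ" treated as a high-priority queue) are assumptions for illustration.

```python
from collections import namedtuple

# What happens to a job whose time slice has expired (illustrative record).
Requeue = namedtuple(
    "Requeue", "destination new_quantum new_protect_time swappable"
)


def time_slice_expired(queue):
    """Return the requeue action for a job currently in `queue`."""
    if queue == "PQ1":
        # Back of PQ2, new quantum runtime only.
        return Requeue("PQ2", True, False, False)
    if queue == "PQ2":
        # Back of PQ2, new quantum and in-core protect time, eligible for swap-out.
        return Requeue("PQ2", True, True, True)
    if queue.startswith("HPQ"):
        # Back of the same HPQ, new quantum runtime only.
        return Requeue(queue, True, False, False)
    raise ValueError("not a processor queue: " + queue)
```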



3.16 QXFER TO DICLNK SECTION 

This routine performs all of the physical queue transfers. It is 
called with a job number in AC J and the address of a transfer table 
in AC U. 

Transfer tables occur in two formats (depending on the right half of 
the first word) . 

1.  Fixed-destination queue.

        POSITION OPTION     |  QFIX
        QUANTUM OPTION      |  NUMBER OF DESTINATION QUEUE

2.  Destination queue determined by source queue, quantum runtime
    determined by job size.

        POSITION OPTION     |  QLNKZ
        QUANTUM OPTION      |  ADDRESS OF TABLE FOR DESTINATION QUEUE


Position Option:

     0      = Requeue to beginning of queue

     400000 = Requeue to end of queue

Quantum Option:

     If negative = Do not assign quantum runtime

     If positive = Format 1, address of word containing amount
                   of quantum runtime to assign.
                   Format 2, quantum runtime is to be computed
                   (within QLNKZ routine).

QFIX/QLNKZ   Name of requeue routine.

At QLNKZ, the destination queue is determined by indexing into the
specified destination queue table. At present, only one such table
exists (QRQTBL in COMMON); it contains one entry for each processor
queue.

     INDEX     DESTINATION

     HPQ       Same HPQ
     PQ1       PQ2
     PQ2       PQ2

If the transfer table requests computation of quantum runtime, it is
calculated in routine CMPQRT as shown in the following.

     quantum runtime = min(QMX, QAD + (K*QML)/QRANGE)

QMX      is taken from table QMXTAB by the destination queue;
         it is the largest quantum runtime permitted for that
         queue.

QAD      is taken from table QADTAB by the destination queue;
         it is the base quantum runtime for all jobs.

K        is the size of the job in K (1024 words).

QML      is taken from table QMLTAB by the destination queue;
         it is a multiplier factor used to modify quantum
         runtime by job size.

QRANGE   is used to scale the multiplier factor.
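
A small worked sketch of the computation, using the quantities defined above. It assumes that QRANGE divides only the size-dependent term and that the division is an integer division; the numeric values in the usage note are invented for illustration and are not installation defaults.

```python
def quantum_runtime(qmx, qad, qml, k, qrange):
    """Base quantum plus a size-scaled increment, capped at the queue maximum.

    Assumes integer arithmetic and that QRANGE scales only the K*QML term.
    """
    return min(qmx, qad + (k * qml) // qrange)
```

With hypothetical values QMX=30, QAD=6, QML=4, and QRANGE=16, a 20K job would receive 6 + (20*4)/16 = 11 units, while a 200K job would be capped at the queue maximum of 30.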

The tables QMXTAB, QADTAB, QMLTAB, and QRQTBL each have one entry per
processor queue. This makes it easier to assign quantum runtimes
differently for the processor queues (that is, HPQs, PQ1, and PQ2).

At QFIX, the destination queue is specified by the indicated transfer
table. However, if the destination is a processor queue and the job's
current HPQ indication (pointed to by HPQPNT) is nonzero, the job will
be placed in that HPQ.

In transfer tables, the codes could be set as shown in Table 3-4. 



Table 3-4
Example of Codes Set For Transfer Tables

Label     Content            Description

QNULW:    400000,,QFIX       Transfer to back of NULQ.
          -1,,-NULQ          Do not assign quantum runtime.

QSTOP:
QSTOPW:   400000,,QFIX       Transfer to back of STOPQ.
          -1,,-STOPQ         Do not assign quantum runtime.

QJDCW:    400000,,QFIX       Transfer to back of JDCQ.
          -1,,-JDCQ          Do not assign quantum runtime.

QCMW:     400000,,QFIX       Transfer to back of CMQ.
          -1,,-CMQ           Do not assign quantum runtime.

QTSW:
QRNW:     400000,,QFIX       Transfer to back of PQ1.
          QADTAB,,-PQ1       Assign quantum runtime by QADTAB.

QRNW1:    400000,,QFIX       Transfer to back of PQ1.
          -1,,-PQ1           Do not assign quantum runtime.

QTIOWW:   400000,,QFIX       Transfer to back of TTY I/O wait.
          -1,,-TIOWQ         Do not assign quantum runtime.

QSLPW:    400000,,QFIX       Transfer to back of sleep queue.
          -1,,-SLPQ          Do not assign quantum runtime.

QTIME:    400000,,QLNKZ      Transfer to queue specified by QRQTBL.
          0,,QRQTBL          Compute quantum runtime in QLNKZ.

QEWW:     400000,,QFIX       Transfer to back of EWQ.
          -1,,-EWQ           Do not assign quantum runtime.

QRNW2:    400000,,QFIX       Transfer to back of PQ2.
          -1,,-PQ2           Do not assign quantum runtime.



Table JBTCQ contains all the master queues. The table has one entry
for each job and two entries for each master queue. Each master queue
requires two entries because the master queues are divided between
jobs with core and jobs with no core. These entries are referred to
as queue headers. The queue headers are defined in the negative
direction from JBTCQ.

If there are n master queues, the first n entries above JBTCQ in the
negative direction are the in-core headers and the next n entries in 
the negative direction are the out-core headers. Each queue has an 
associated queue number. The location of the in-core header for a 
queue is JBTCQ minus the queue number. The location of the out-core 
header for a queue is JBTCQ minus the number of master queues minus 
the queue number. The entry for each job is located at JBTCQ plus the 
job number. The zero entry of JBTCQ, which would correspond to the 
null job, is not used. 

Each entry in the table contains a pointer to the previous entry in 
the left half and a pointer to the next entry in the right half. 
Therefore, the queue headers contain a pointer to the last job in the 
left half and a pointer to the first job in the right half. The last 
job in the queue has a pointer back to the queue header (that is, a 
negative number) in the right half. Similarly, the left half of the 
first job in the queue points to the header. If a queue is empty, 
both pointers in the queue header point to itself. 

For example, assume queue 2 contains jobs 1,4,2 in core and jobs 5,7,3 
not in core. Queue 2 could be represented in JBTCQ as follows: 



     Location      Left half    Right half

     -MXQUE-2          3            5        queue header for section of
                                             queue with no core
     -MXQUE-1
     -MXQUE
        .
     -3
     -2                2            1        queue header for section of
                                             queue with core
     -1
     JBTCQ
      1               -2            4        entry for job 1
      2                4           -2        entry for job 2
      3                7        -MXQUE-2     entry for job 3
      4                1            2        entry for job 4
      5            -MXQUE-2         7        entry for job 5
      6
      7                5            3        entry for job 7
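
The header arithmetic and the doubly linked layout described above can be checked with a short sketch. Offsets are expressed relative to JBTCQ and kept in a Python dictionary; MXQUE is given an arbitrary value, and the helper names are hypothetical.

```python
MXQUE = 5  # hypothetical number of master queues


def in_core_header(queue):
    """Offset of the in-core header: JBTCQ minus the queue number."""
    return -queue


def out_core_header(queue):
    """Offset of the out-core header: JBTCQ minus MXQUE minus the queue number."""
    return -MXQUE - queue


def link_chain(table, header, jobs):
    """Store (left=previous, right=next) for a header and its jobs, in order."""
    ring = [header] + list(jobs)
    for i, entry in enumerate(ring):
        previous = ring[i - 1]                 # for the header this is the last job
        following = ring[(i + 1) % len(ring)]  # for the last job this is the header
        table[entry] = (previous, following)


def walk(table, header):
    """Follow right-half pointers from the header until it is reached again."""
    jobs = []
    entry = table[header][1]
    while entry != header:
        jobs.append(entry)
        entry = table[entry][1]
    return jobs
```

Building queue 2 with jobs 1,4,2 in core and 5,7,3 out of core reproduces the pointer pairs in the example, and an empty chain leaves both halves of its header pointing at the header itself.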



In Class Scheduler mode, all jobs in PQ2 also have an entry in the 
table JBTCSQ. This table has headers that correspond to scheduler 
classes, also referred to as subqueues. Each subqueue has one header 
for jobs with core and one for jobs with no core. 

The location of the in-core header for a subqueue is JBTCSQ minus one 
minus the class number. The location of the out-core header is JBTCSQ
minus the number of classes minus one minus the class number. 
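
The two header locations can be written as offsets from JBTCSQ. In this sketch the number of classes is passed as a parameter; the function names are hypothetical.

```python
def in_core_subqueue_header(class_number):
    """Offset from JBTCSQ of the in-core header: minus one minus the class number."""
    return -1 - class_number


def out_core_subqueue_header(class_number, number_of_classes):
    """Offset from JBTCSQ of the out-core header for the given class."""
    return -number_of_classes - 1 - class_number
```

With four classes, for instance, class 0 has its in-core header at offset -1 and its out-core header at offset -5.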

Entries in JBTCSQ consist of a forward pointer and a backward pointer
as in the master queues. Only the entries corresponding to headers or 
jobs in PQ2 have valid pointers. 

For example, suppose class 0 contains jobs 1,4,2 in core and job 5 not
in core, and class 1 contains no jobs in core and jobs 7,3 not in
core. Subqueues 0 and 1 could be represented in JBTCSQ as follows:



     Location      Left half    Right half

     -M.CLSN-2         3            7        queue header for class 1 with
                                             no core
     -M.CLSN-1         5            5        queue header for class 0 with
                                             no core
     -M.CLSN
        .
     -3
     -2               -2           -2        queue header for class 1 with
                                             core
     -1                2            1        queue header for class 0 with
                                             core
     JBTCSQ
      1               -1            4        entry for job 1
      2                4           -1        entry for job 2
      3                7        -M.CLSN-2    entry for job 3
      4                1            2        entry for job 4
      5            -M.CLSN-1    -M.CLSN-1    entry for job 5
      6
      7            -M.CLSN-2        3        entry for job 7



In addition, jobs in PQ2 that have core will also have an entry in
either an input list (JBTJIL) or an output list (JBTOLS). The input
list gives the order in which jobs with in-core protect time entered
PQ2. The output list gives the order in which jobs were requeued to
PQ2 after expiring in-core protect time. In Class Scheduler mode,
each of these queues is subdivided into normal jobs and background
batch jobs.

For example, suppose the just-swapped-in list contains jobs 1,4,2 in
the regular chain and job 5 in the background batch chain, and the
output list contains jobs 7,3 in the regular chain and no jobs in the
background batch chain. The queues would be as follows:



     Location      Left half    Right half

     -BBQ              5            5        queue header for background
                                             batch just-swapped-in list
     -JIQ              2            1        queue header for regular
                                             just-swapped-in list
     JBTJIL
      1              -JIQ           4        entry for job 1
      2                4          -JIQ       entry for job 2
      3
      4                1            2        entry for job 4
      5              -BBQ         -BBQ       entry for job 5

     -OBQ            -OBQ         -OBQ       queue header for background
                                             batch output list
     -OLQ              3            7        queue header for regular
                                             output list
     JBTOLS
      1
      2
      3                7          -OLQ       entry for job 3
      4
      5
      6
      7              -OLQ           3        entry for job 7



3.17 SUBROUTINE DICLNK 

The DICLNK subroutine moves a job from the in-core queues to the 
corresponding out-core queues. This subroutine is called when a job
gives up core. 



3.18 SUBROUTINE IICLNK

The IICLNK subroutine moves a job from the out-core queues to the
corresponding in-core queues. This subroutine is called when core is
assigned to a job. 



3.19 SUBROUTINE DCCLNK 

The DCCLNK subroutine is used by DICLNK and IICLNK to delete a job 
from its current master queue and from its subqueue in the class
scheduler. The scheduler is locked while the linked lists are being
updated.



3.20 SUBROUTINES ICCLNK AND ICSLNK

The ICCLNK and ICSLNK subroutines are used by DICLNK and IICLNK to
insert a job into its proper master queue and subqueue. 



3.21 SUBROUTINE INOLST 

The INOLST subroutine inserts a job in the output list if it has core 
and is eligible to be swapped out. If the job is already in the 
subroutine inserts background batch jobs in the background batch chain
and other jobs in the regular chain. INOLST is called when a job is 
requeued to PQ2. 



3.22 SUBROUTINE DLOLST 

The DLOLST subroutine deletes a job from the output list if it is in 
one. This subroutine is called when a job is requeued unless it is 
going from PQ1 to PQ2. It is also called when a job is swapped in or 
out. 



3.23 SUBROUTINE DLJILS 

The DLJILS subroutine deletes a job from the just-swapped-in list.
This subroutine is called when a job is requeued from PQ2 and when it
is swapped out. 



3.24 QSCAN THROUGH FSQFOR SECTION 

QSCAN scans the queues as specified by a scan table. It returns the 
job number of the next job in AC J. If the calling routine wishes to 
reject a job and continue the scan it must JRST (T2) . The calling 
sequence is shown below. 

     MOVEI   0,address of scan table
     JSP     T1,QSCAN
     Return here when no more jobs.
     Return here with next job.

The format of the scan table is shown below.

     SCANTAB: XWD     Q1,CODE1
                .
                .
              XWD     Qn,CODEn
              Z                   (Zero terminates table)

In this case, CODE specifies the routine used for scanning. Table 3-5 
lists the possible codes and their meanings. 

Table 3-5
Possible Codes

Code      Meaning

QFOR      Scans whole queue forward. First scans the in-core
          chain, then the out-core chain.

QBAK      Scans whole queue backward. First scans the out-core
          chain, then the in-core chain.

IQFOR     Scans in-core queue forward.

IQBAK     Scans in-core queue backward.

IQFOR1    Scans in-core queue for first member.

IQBAK1    Scans in-core queue backward (all but first member).

OQFOR     Scans out-core queue forward.

OQBAK     Scans out-core queue backward.

OQFOR1    Scans out-core queue for first member.

OQBAK1    Scans out-core queue backward (all but first member).

SQFOR     Scans out-core subqueues (PQ2 class swap-in scan).

BQFOR     Scans out-core background batch subqueue (PQ2 class
          swap-in scan).

ISSFOR    Scans in-core subqueues (PQ2 class scheduling scan).

IBBFOR    Scans in-core background batch subqueue (PQ2 class
          scheduling scan).

OSSFOR    Scans out-core subqueues (PQ2 class lost-time scan).

IRRFOR    Scans just-swapped-in queue, then PQ2 in-core queue
          (PQ2 Round Robin scheduling scan).

ILFOR     Scans just-swapped-in queue and jobs waiting for a
          high segment as a result of a GETSEG UUO a certain
          percentage of the time (PQ2 swap-in scan).

OLFOR     Scans background batch output queue, then background
          batch just-swapped-in queue, then regular output
          queue, then PQ2 in-core queue backward (PQ2 output
          scan).



3.25 FSQFOR THROUGH BQFOR SECTION 

The SQFOR routine is used by the class scheduler for the PQ2 swap-in
scan. First, it scans the primary class. Second, it scans any 
classes with nonzero secondary allocations. 

At FSQFOR, set the SWPFAR flag to indicate that the swapper reached 
fair territory. 

If in Round Robin mode (RRFLAG = 0), go to OQFOR to scan the PQ2
out-core queue. Then, subtract one from SQCNT, and if it reaches 
zero, call SQINI to reinitialize the primary scan pointer. Finally, 
load the class number of the current primary subqueue into AC J. 

At SQFORA, scan the out-core subqueue for that class. 

From SQFOR1 to SQFOR2, build the secondary scan table SQSCAN. The 
primary class and any classes with no jobs in the out-core chain are 
rejected for efficiency. Any class with the fixed swap-in bit set 
(bit 0 of CLSSTS = 1) is also rejected, because this class is allowed
to swap only when it is the primary class. All other classes with
secondary allocations (CLSQTA>0) are stored in the SQSCAN table in the
form XWD -CLASS-1, secondary allocation. The sum of the secondary
allocations of all classes entered into the table is accumulated in
SQSUM.

At SQFOR3, select a random integer in the range 0 to SQSUM-1. This
integer determines which class will be selected next for the secondary 
scan. The secondary allocations of each entry in SQSCAN are 
successively subtracted from the random integer until it goes 
negative. The class that causes it to go negative is selected as the 
next class to scan. Therefore, the probability of any given class 
being selected is equal to its secondary allocation divided by SQSUM. 

Eliminate the selected class from the SQSCAN table by moving the top 
entry down on top of it and subtracting its secondary allocation from 
SQSUM. 
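
The selection at SQFOR3 is a weighted random draw by successive subtraction. The sketch below models SQSCAN as a Python list of (class, secondary allocation) pairs; the function name and the optional `draw` parameter (which lets the random integer be supplied for testing) are assumptions for illustration.

```python
import random


def pick_class(sqscan, draw=None):
    """Select and remove one class from sqscan, weighted by secondary allocation.

    sqscan: non-empty list of (class_number, secondary_allocation) pairs.
    draw:   optional integer in the range 0 to SQSUM-1; random if omitted.
    """
    sqsum = sum(allocation for _, allocation in sqscan)
    if draw is None:
        draw = random.randrange(sqsum)
    for i, (class_number, allocation) in enumerate(sqscan):
        draw -= allocation
        if draw < 0:
            # Eliminate the entry by moving the top entry down on top of it.
            sqscan[i] = sqscan[-1]
            sqscan.pop()
            return class_number
    raise ValueError("draw out of range")
```

Because SQSUM is recomputed from the remaining entries on each call, the removed class's allocation no longer counts toward later draws, matching the subtraction from SQSUM described above.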

At SQFORB, scan the out-core subqueue for the selected class. If no
job is selected by the scan, decrement the count of classes left in
SQSCAN. If any classes remain, go to SQFOR3 to select another class,
otherwise go to the SQFOR routine.



3.26 BQFOR THROUGH ISSFOR SECTION

The BQFOR routine is used by the Class Scheduler to scan for
background batch swap-in.

If in Round Robin mode (RRFLAG = 0), exit from the BQFOR routine. If
no background batch class is defined (BBSUBQ<0) or not enough time has
elapsed since the last background batch swap-in (UPTIME<SCNBBS), exit
from the BQFOR routine. Otherwise, scan the out-core subqueue for the
background batch class in BBFOR2. While scanning background batch,
set BBFLAG to -1.

The SQINI routine is used to initialize the swapper's primary scan 
pointer. The counter SQCNT is initialized to 100 and indicates the 
number of entries left in this pass through the table. The byte 
pointer SQPNT is initialized to point to the imaginary byte preceding 
the first entry in PSQTAB. 




for a time interval that approximates the average swap-in 
(SCDSWP) . 

If in Round Robin mode, exit from the SQTEST routine. Then, add one 
to the count of how many ticks the current primary class has been 
scanned (SCNSWP). If this class has been scanned often enough
(SCNSWP>SCDSWP), clear SCNSWP and allow the primary scan pointer to
advance to the next class. Otherwise, reset the primary scan pointer 
so that the same class will be scanned on the next tick. Then, add 
one to the count of primary classes left (SQCNT) , and decrement the 
byte pointer (SQPNT) so it will point to the current class when 

The routine RAND returns a random integer less than 2**17 in index T2.
The algorithm is multiplicative modulo 2**35.

Multiply the current seed by 377775 octal. Store the low-order 35
bits as the new seed. Out of these, extract the leftmost 17 bits as
the current random number.
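
The generator described above is small enough to restate directly. The sketch returns the new seed along with the random value instead of storing it in a monitor location; the function and constant names are hypothetical.

```python
MULTIPLIER = 0o377775      # multiplier given in the text (octal)
MASK35 = (1 << 35) - 1     # keep the low-order 35 bits


def rand_step(seed):
    """Advance the seed once; return (new_seed, 17-bit random value)."""
    new_seed = (seed * MULTIPLIER) & MASK35
    # The leftmost 17 of the 35 bits are what remains after dropping the low 18.
    return new_seed, new_seed >> 18
```

Callers would feed the returned seed back in on the next call, so successive values form the multiplicative sequence modulo 2**35.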



3.27 ISSFOR THROUGH OSSFOR SECTION 

The ISSFOR routine is used by the Class Scheduler mode for its PQ2
scheduling scan. The subqueues are scanned in the order specified by
the subqueue scheduling scan table (SSSCAN for CPU0 and SSSCN1 for
CPU1).

The scan table is built at the beginning of each microscheduling 
interval by the routine SCDQTA. Each entry in the table is of the 
form -CLASS-1. The first entry in the table is the primary class. 
The percentage of time that each class is selected as the primary 
class is determined by the primary percentage for that class. The 
remaining entries in the scan table are the secondary classes. SCDQTA 
uses probability to determine the order of the secondary classes, and 
uses the secondary allocation of each class to determine its relative 
priority. 

To ensure a minimum level of response and to prevent core from 
becoming clogged with jobs that come from classes with low primary 
percentages, a portion of each microscheduling interval is dedicated 
to running jobs in the order in which they were swapped in. The 
response fairness factor (SCDJIL) controls the percentage of time that 
this special scan is in effect. 

The code for ISSFOR is as follows. 

If in Round Robin mode (RRFLAG = 0), go to IRRFOR to scan PQ2 forward.
If response fairness is in effect (UPTIME<SCNJIL), scan the
just-swapped-in queue at SJFORA. If response fairness is not in
effect or no runnable job is found in the just-swapped-in queue, scan
the subqueues in the order specified by the subqueue scheduling scan
table. Then, set AC M to the base address of the table. At SSFOR1,
if the word pointed to by M is zero, go to SSFOR2 because the end of
the table has been reached. Otherwise, scan the in-core subqueue for
the class pointed to by M. If no runnable job is found, add one to M
and go to SSFOR1 to scan the next class in the scan table.

At SSFOR2, the primary and all secondary subqueues have been scanned.
If the just-swapped-in queue was not scanned previously
(UPTIME>=SCNJIL), go to ILFOR1 and scan it now.

The only jobs that would be scanned by this final scan of JBTJIQ are
jobs that are in a class with no secondary allocation that have not
yet expired 1 time slice. Because these jobs will be scanned at the
beginning of the next microscheduling interval (when UPTIME<SCNJIL),
they can be scheduled now because no other PQ2 timesharing jobs are
runnable.

The routine IBBFOR is used by the Class Scheduler mode to schedule
background batch jobs. 

If in Round Robin mode (RRFLAG = 0), exit from IBBFOR. To indicate
that the background batch is being scanned, set BBFLAG to -1. Then,
scan the background batch just-swapped-in queue (JBTBBQ) at BBFORA.
This action schedules any background batch job that has not yet
expired one time slice ahead of those that have expired their time
slices. Finally, scan the in-core subqueue for the background batch
class at BBFORB. Zero BBFLAG and exit IBBFOR if no runnable job has
been found. 



3.28 OSSFOR THROUGH ILFOR SECTION 

The routine OSSFOR is used for the Class-Scheduler-mode PQ2 lost-time 
scan. 

Set M to the base address of the scan table. For each entry in the 
table, scan the out-core subqueue for that class at SSFORB. When all 
classes in the scan table have been processed, scan the out-core 
subqueue for the background batch class at OBBFOR. 

The Round Robin scheduler uses the routine IRRFOR for its PQ2 
scheduling scan. For best response, all PQ2 jobs that have not yet
expired 1 time slice are scheduled ahead of those that have expired at 
least 1 time slice. 

Scan the just-swapped-in queue (JBTJIQ) at RJFORA. Then, go to IQFOR
to scan the full PQ2 in-core queue. 



3.29 ILFOR THROUGH SAVSOM SECTION 

The swapper uses routine ILFOR to scan for PQ2 jobs that have done
GETSEGs and need to be linked up to their high segments. The in-core
fairness factor (SCDIOF) controls how often these jobs are scanned
ahead of regular PQ2 jobs.

ILFOR generates a random number in the range 0 to 99. If this number
is greater than or equal to SCDIOF, exit the ILFOR routine.
Otherwise, scan the just-swapped-in queue at ILFORA (ignore jobs with
JS.HNG = 1).
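
The effect of the random-number test is that ILFOR scans ahead of
regular PQ2 jobs roughly SCDIOF percent of the time. The following
Python sketch illustrates that test only. The function and class names
are hypothetical, and the random source is passed in so the behavior
can be checked deterministically.

```python
import random


def ilfor_scans_first(scdiof, rng=random):
    """Return True when ILFOR goes on to scan the just-swapped-in queue."""
    number = rng.randrange(100)          # uniform integer, 0 through 99
    return number < scdiof               # exit (False) when number >= SCDIOF


class FixedRng:
    """Test double that always returns the same 'random' number."""

    def __init__(self, value):
        self.value = value

    def randrange(self, stop):
        return self.value
```

With SCDIOF set to 0 the scan never takes place, and with SCDIOF set
to 100 it always takes place.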

The swapper uses routine OLFOR to scan PQ2 for output. 

The swapper also saves SUMCOR in the temporary variable SAVSUM and
performs the following tasks:

1. Scans the background batch output queue at OLFORA.

2. Scans the background batch just-swapped-in queue at OLFORB.

3. Scans the regular output queue at OLFORC.

4. Resets SUMCOR to SAVSUM.

5. Scans the PQ2 in-core queue backward at IQBAK.

It is necessary to reset SUMCOR because some jobs will be scanned 
twice, once by OLFOR and once by IQBAK. 

Note that all background batch jobs are swapped out ahead of any PQ2
timesharing jobs. If the system administrator wishes to give
background batch jobs that have not expired 1 time slice a higher
priority for remaining in core than timesharing jobs that have expired
their time slices, the OLFORB code should be moved below the OLFORC
code.



CHAPTER 4
DETAILED DESCRIPTION OF SWAPPER 



The swapper code is dependent on a number of assembly switches. This 
discussion assumes a KI10 processor (FTKI10 = -1), with the
virtual-memory option (FTVM = -1) and the high-availability option
(FTDHIA = -1), which does not swap PDBs (FTPDBS = 0).

The swapper is entered at the label SWAP. It determines whether or 
not any jobs require swap-in or swap-out, and if so sets up the 
required swap control information in tables for VMSER and SWPSER. The 
actual swap (and any virtual-memory paging) is performed at interrupt 
level in VMSER and SWPSER. 

Most operations started by the swapper require several clock ticks to 
run to completion. A number of flags are used to remember previous
starts. Table 4-1 is a list of the important flags and data items 
used by the swapper. 



Table 4-1
Important Flags and Data Items

Code      Meaning

FIT       Job chosen by QSCAN to be swapped in.

FORCE     Job chosen to be swapped out.

FORCEF    Job being forced out but waiting to give up disk
          resource.

SWPIN     Job number associated with high segment being swapped
          in.

SWPOUT    Job number associated with high segment swapped out
          last.

LASIN     Last segment swapped in.

LASOUT    Last segment swapped out.

SWPERC    LH = number of swap errors. RH = number of pages lost;
          bits 18-23 = error flags.

MAXJBN    Job number of job to swap out.

SUMCOR    Total amount of core found so far in jobs eligible to
          swap out.

INFLG     NOFIT flag: same job waiting to be swapped in for
          360 ticks.

INFLGJ    Frustrated job waiting to be swapped in.

INFLGC    Time frustrated job started waiting.

SPRCNT    Number of swap operations in progress.

SWPCNT    Number of jobs finished with data transmission, waiting
          for cleanup.

The overall operation of the swapper is described in the following.

The next job to be swapped in is selected by the input scan and stored
in the item FIT. If the job will fit in available (unused) free core,
it is immediately swapped in. If the job will not fit, but would fit
if the idle and dormant high segments were deleted from core, enough
are deleted until the job will fit. The job is then swapped in.

If the job would not fit even if all idle and dormant high segments 
were deleted, the swapper checks to see if the job would fit if all 
jobs eligible to swap were swapped out. If no, the swapper exits and 
checks again each clock tick. If yes, jobs are sequentially selected 
by the output scan and put in the item FORCE to be forced out. 

When all I/O has stopped and all sharable disk resources have been 
given up, the job is swapped out. While waiting for I/O to stop or 
resources to be given up, the swapper exits and rechecks on the
occurrence of each clock tick. 

After each job is swapped out, control returns to checking whether the 
job will fit after all idle and dormant segments have been deleted. 
When the job will fit, the high segments are deleted and the job is 
swapped in. 
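
The overall decision the swapper makes for the job in FIT can be
summarized in a few lines. The Python sketch below is a simplified
model, not the monitor code: the function name, the argument names, and
the returned strings are hypothetical, and all sizes are taken to be
page counts.

```python
def swapper_action(job_size, free, idle_dormant, swappable):
    """Pick the swapper's next step for the job waiting in FIT.

    free         -- unused core
    idle_dormant -- core held by idle and dormant high segments
    swappable    -- core held by jobs eligible to be swapped out
    """
    if job_size <= free:
        return "swap in"
    if job_size <= free + idle_dormant:
        return "delete high segments, then swap in"
    if job_size <= free + idle_dormant + swappable:
        return "force jobs out"
    return "exit and recheck next tick"
```

Each case is tried only when the cheaper one before it fails, which
keeps runnable jobs in core as long as possible.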

The following sections provide a description of the swapper at the 
level of the macro code. The labels referenced are in the last half 
of the module SCHED1. 



4.1 SWAP TO SWAP1 SECTION 

This section provides improved swapping response for HPQ jobs.

If FIT is zero, go to SWAPOA (equivalent to SWAP1); otherwise, check
(SKIPG J,.CPRTF(P4)) to see if an HPQ job wants to be swapped in. If
no, go to SWAPOA. If yes, and this job is of higher priority than the
[text illegible in source]
has a high segment that is already in core, in which case, go to
SWAPOA.



4.2 SWAP1 TO FININ0 SECTION

This section determines if swapping input or output has just finished,
and if so, branches to the appropriate wrap-up routine.

At SWAP1, if no swapping requests have just finished (SWPCNT = 0), go
to SWP2 and bypass swapping wrap-up. Otherwise, at FININN test
whether the swap just completed was swap-in (IO = 0) or swap-out
(IO = 1). For swap-out, go to FINOUT. For swap-in, check for swap
read error (sign bit of S = 1) and if yes, go to INERR. Otherwise, go
to FININ0.



4.3 FININ0 TO INERR SECTION

This section does the wrap-up after a segment has been swapped in.

At FININ0, if the segment just swapped in is a high segment (job
number greater than JOBMAX), go to FININH. Otherwise, for a low
segment, return swapper space and delete the SWPLST entry (at GIVBAK),
then go to FININ5 (which jumps to FININ1).

At FININH, get the job number of the low segment associated with this
high segment (from SWPIN). If this job is migrating to a new device
(SWPIN = MIGRAT), clear the high-segment swapping space (at ZERSWP) so
that it will be swapped out to a different unit and fall through to
FININ1.

At FININ1, using FINIS subroutine at SEGCON:

1. If a low segment was just swapped in and the associated high 
segment is in transit, check to see if there is another swap 
list entry completed (subroutine NXTSLE) . If yes, go to 
FININN to process it. If no, exit from the swapper (POPJ) . 

2. If a low segment was just swapped in and it has an associated
high segment that is not in core, set AC J to the associated
high-segment number, and go to FININ2 to initiate swap-in.

3. If there is no high segment, or it is already swapped in, or
we just swapped it in, then go to FININ3 to wrap up job
swap-in with J = low-segment job number. (Note that in
virtual-memory systems nonsharable high segments are treated
as part of the low segment.)

At FININ2, go to FIT1 to initiate swap-in for the high segment (job
slot indicated by AC J).
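
The three cases handled after a segment has been swapped in can be
sketched as a small dispatch function. This is a hypothetical Python
model of the description above, not the FINIS subroutine itself. The
argument names are assumptions, and the returned strings are the labels
or subroutines named in the text.

```python
def after_swap_in(was_high_segment, has_high_segment=False,
                  high_in_core=False, high_in_transit=False):
    """Return the next label or subroutine after a segment is swapped in."""
    if not was_high_segment and has_high_segment:
        if high_in_transit:
            return "NXTSLE"      # case 1: look for another finished entry
        if not high_in_core:
            return "FININ2"      # case 2: start swap-in of the high segment
    return "FININ3"              # case 3: both segments are in, wrap up
```

Note that a high segment that has just been swapped in always falls
into case 3.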

At FININ3, both segments are now in memory. Do a wrap-up for the job
just swapped in. At this point, J = low-segment number, regardless of
the order in which the job's segments are actually swapped in.

Use IMGIN and IMGOUT to determine if the job size has decreased. If
so, add the amount of decrease to the counter for the amount of
virtual memory available (VIRTAL).

Call subroutine UNSWAP for housekeeping on various swapper flags, to
give back disk space, and to mark the job as swapped in
(SWP = 0, SHF = 0).

For the just-swapped-in job, clear the job-scanned-by-scheduler flag
(JS.SCN), the job-could-not-be-forced-out flag (JS.HNG), and the
job-forced-out-by-timer flag (JS.TFO).

If the just-swapped-in job is migrating (MIGRAT = job number) or is
not in a processor queue or command wait queue, flag the job as
eligible to swap out (PDMSWP = 1) and go to FININ7. This procedure is
needed for jobs that are requeued out of the queue they were in when
they were selected for swap-in (usually by the command decoder).
Otherwise, the job can be marked as not eligible for swap-out
(PDMSWP = 0) and can be in a queue that is not decremented for in-core
protect time.

If the swap-in was caused by a GETSEG (JS.NNQ = 1), go to FININ7.
This avoids reassigning time slices, which would allow jobs doing
GETSEGs to take over the system.

Assign in-core protect time (subroutine ASICPT) for the
just-swapped-in job (bits 1 to 17 in .PDIPT). Then, mark the job not
to be swapped (PDMSWP = 0). If the job is in a processor queue, assign
a quantum runtime depending on which queue the job is in. If the job
is on the swap-out list, delete it from the list (subroutine DLOLST).

If the just-swapped-in job was not background batch, go to FININ6.
Otherwise, set the background batch bit (JS.BBJ) to 1, put the job in
the just-swapped-in background batch queue, and go to FININ7.

At FININ6, clear the background batch bit (JS.BBJ = 0). If the job is
in PQ2, put it in the just-swapped-in timesharing queue.

At FININ7, clear the no-new-quanta bit (JS.NNQ = 0). Clear the
background batch fit flag (BBFIT). Clear the frustration indicators
INFLG, INFLGJ, and INFLGC. Clear the flag that FIT was zeroed by an
HPQ job (.PDHZF), and go to SWP1.



4.4 INERR TO FINOUT SECTION

This section processes input swap read errors.

At INERR, if the segment is a high segment, go to INERR2. Otherwise,
call SWPREC to record errors, clear JACCT, call ZAPUSR to clear all
DDBs and I/O channels, and call CLRJOB to clear the protected part of
the job data area. Then, fall through to INERR2.

At INERR2, save the segment number (PUSH P,J), and call SEGERR.

For low segments, SEGERR returns immediately (POPJ P,). For high
segments, SEGERR sets the high-segment error flag (SERR) to 1, clears
JBTNAM, and returns the virtual swapping space for the high segment if
this is the first time the high segment had the error (SERR was equal
to 0). SEGERR returns with J = associated low segment.

Restore the segment number (POP P,J). If the segment was not a user
page map, go to FININ0. Otherwise, call GVPAGS and give back core for
the page map and segment, delete the SWPLST entry, and clear JBTADR,
JBTDPM, and JBTSWP. If the job has a nonsharable high segment, clear
JBTSGN. Call KILHGH to remove the high segment from the address space,
and go to UNSWAP.



4.5 FINOUT TO SWP1 SECTION

This section does the wrap-up after a segment has been swapped out.

If any errors have occurred (RH of AC S # 0), go to OUTERR.
Otherwise, delete the SWPLST entry (at DLTSL2).

If no jobs are migrating or the segment just swapped was a high
segment, go to FINOU2. Otherwise, check to see if the job has
completely migrated (at PGOFF), and if so, mark the job as completed
(JS.MIG = 1).

At FINOU2, set R to the base address for the segment and call KCORE1
to return core.

At FINOU0, call FINOT and:

Return + 1   With J set to the low-segment number if the segment
             just swapped was a high segment and there is a low
             segment yet to swap. Go to FORCEL to swap out the low
             segment.

Return + 2   If just swapped a low segment and swapping is all
             finished for this user. Fall through to SWP1.



4.6 SWP1 TO FIT1 SECTION

This section contains much of the overall control code of the swapper,
plus the code for the swapping input scan. (See Table 4-2.)

At SWP1 clear the FINISH flag (largely meaningless for virtual-memory
systems). At SWP2, if there is a job to be forced out (FORCE = job
number # 0), go to FORCE1 and try to swap it out. Otherwise, go to
FIT0.

At FIT0, if there are swaps in progress (SPRCNT # 0), go to CHKXPN.
Otherwise, if a swap has completed (SWPCNT # 0), go to SWAP1. If there
is a job waiting to swap in (FIT = job number # 0), go to FIT1.
Otherwise, start the input scan by performing the following tasks.

1. Zero BBFLAG to indicate that background batch is not
currently being scanned.

2. Zero SWPFAR to indicate that the swap-in scan did not yet
reach fair territory (PQ2).

3. Select the proper scan table depending on the swapper
fairness count.



Table 4-2
Primary and Secondary Scan Swapping Tables

Primary Scan (ISCAN)

Queue     Routine

HPQs      QFOR
CMQ       QFOR
PQ1       IQFOR
PQ2       ILFOR
PQ1       OQFOR
PQ2       OQFOR (Round Robin mode)
PQ2       SQFOR (Class Scheduler mode)
PQ2       IQFOR
PQ2       BQFOR (Class Scheduler mode)

Secondary Scan (ISCAN1)

Queue     Routine

HPQs      QFOR
CMQ       QFOR
PQ2       OQFOR (Round Robin mode)
PQ2       SQFOR (Class Scheduler mode)
PQ1       OQFOR
PQ2       ILFOR
PQ1       IQFOR
PQ2       IQFOR
PQ2       BQFOR (Class Scheduler mode)



If the number of jobs in PQ1 swapped in a row is less than the maximum
allowed (SWPIFC less than MAXIFC), use the primary scan. Otherwise,
use the secondary scan.

At FITPRM, call QSCAN to do an input scan. At Return + 1, if all
queues have been searched, go to ZCKXPN (no jobs waiting to swap in
were found). At Return + 2 and following, process jobs returned from
the scan (code below).



If the job is expanding (JXPN = 1 or JS.XPN = 1), or if all of the
job's segments are in core (SWP = 0 for the low segment), or if the
high segment is expanding (determined by CKXPN), reject the job and
request the next job from QSCAN (JRST (T2)).

Otherwise, select the job as the next job to swap in (set FIT = job
number).

If the job in FIT was selected in the background batch scan, remember
the job number in BBFIT.

Otherwise, clear the SCNSWP counter (used to control the advance of
the swapper primary scan pointer when there are no jobs to be swapped
in).

If the swap-in scan reached fair territory (that is, the job selected
was in PQ2), clear the counter for the number of unfair scans
(SWPIFC). Otherwise, add one to the number of unfair scans (SWPIFC),
and go to FIT1A.
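
The interaction between the unfair-scan counter and the choice of scan
table can be illustrated with the following Python sketch. The function
names are hypothetical and the queue is passed as a plain string. Only
the comparison of SWPIFC with MAXIFC and the reset on reaching PQ2 are
taken from the description above.

```python
def choose_input_scan(swpifc, maxifc):
    """Pick the scan table: primary while the unfair count is below the limit."""
    return "primary" if swpifc < maxifc else "secondary"


def update_unfair_count(swpifc, selected_queue):
    """Clear the count when the scan reached PQ2, otherwise add one."""
    return 0 if selected_queue == "PQ2" else swpifc + 1
```

For example, with MAXIFC equal to 2, two PQ1 selections in a row raise
the count to 2 and switch the next scan to the secondary table. The
first PQ2 selection clears the count and restores the primary table.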

Only one job at a time is entered into the swapping tables for the
interrupt-level routines to process. This is because jobs become
unrunnable (SWP = 1) as soon as they are entered, so entering several
at once would result in too much core memory being tied up with jobs
that are not runnable. The only exception is jobs that are expanding,
which may be stacked into the swapping tables because they are not
runnable in any case.



4.7 FIT1 TO OUTERR SECTION

This section controls the swap-in process so that it proceeds in an
optimum manner.

If the job in FIT will fit in available free core, swap it in (at
SWAPI).

Otherwise, if the job will fit when a number of idle or dormant high
segments have been deleted, then delete that number of high segments
and swap the job in at SWAPI.

Otherwise, go to SCNOUT to see if enough space is available to swap
the job in. If there is not enough space, SCNOUT will exit and FIT1
will be reentered next tick for a reevaluation. If enough space
exists, SCNOUT will put the next job to swap out into FORCE to force
it out.

In detail, the logic of this section is as follows:

At FIT1, save the job number in FIT because it may just have been
selected for swap-in (if entered from FININ2).

Check (at CKXPN) to see if this is a low segment for which the
associated high segment is expanding (perhaps by some other job that
is sharing it). If yes, go to NOFIT2 to deselect this job (set FIT to
0) and exit the swapper. (The input scan will not select such a job.
The output scan will swap out and expand the high segment. Only then
may any of the jobs swap in again.) If no, fall through to FIT1A.

At FIT1A, put the size of the low segment plus the size of the page
map (UPMPSZ) in P1 in preparation for calling FITSIZ. In special
cases where the low segment is already in core (JBTADR(J) # 0), set P1
to zero.

Call the routine FITSIZ to determine if the job will fit in free
core plus the space occupied by idle and dormant segments. (Idle
segments are high segments that are linked to low segments on the
swapper but not to any low segments in core. Dormant high segments
are not connected to any low segments, either on the swapper or in
core.) FITSIZ determines if the job will fit by testing for the
existence of a high segment, and if required, by adding its size to
the total job size in P1. The total job size (P1) is then compared
with free + dormant + idle space (CORTAL). Returns from FITSIZ are as
follows:

Return + 1   Job will not fit, go to SCNOUT (to try to swap
             jobs out).

Return + 2   Job will fit, go to FIT1B (swap into free core, or
             delete high segments as required, then swap in).

At FIT1B, the job being swapped is known to fit in
free + dormant + idle core. If the job will fit in free core (job
size less than or equal to BIGHOL), go to SWAPI and swap the job in.
(On KA10s, BIGHOL is the largest contiguous block of available memory,
not all of free core. To get a fit, KA10s may need to shuffle to
increase BIGHOL.) If the job will not fit, use subroutine FRECR1 to
delete idle or dormant segments. (For KA10s, this implies a
preference to delete idle or dormant segments before shuffling.)

FRECR1 searches for dormant and then idle high segments to delete. It
will not delete a high segment that is needed by the job being swapped
in. Returns are as follows:

Return + 1   High segment selected has no copy on the swapper, so
             it must be swapped out. Go to FORIDL (high
             segment was just associated by get segment, or for
             non-virtual-memory systems high segment was
             nonsharable).

Return + 2   One idle or dormant segment has been deleted, go
             to FIT1B and see if job now fits.

Return + 3   All idle and dormant segments have been deleted.
             Job still did not fit; execute code starting at
             SHFPAT.

At SHFPAT, check to see if there are any holes in memory that the
shuffler could eliminate (HOLEF # 0). If no (always no for KI10 and
KL10), go to SCNOUT to try to swap some jobs out. If yes, call the
shuffler and successively move jobs (only for KA10s).
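
The loop between the free-core test and the segment-deletion routine
can be modeled as shown below. This Python sketch is an illustration
under stated assumptions: the function name is hypothetical, the
deletable segments are given as a list of (size, has_swap_copy) pairs
already ordered dormant first and then idle, and the returned strings
only name the outcome.

```python
def fit_by_deleting_segments(job_size, free, deletable_segments):
    """Delete idle or dormant segments until the job fits in free core.

    Returns (outcome, free_core_after).
    """
    remaining = list(deletable_segments)
    while job_size > free:                 # job does not yet fit
        if not remaining:
            return "SHFPAT", free          # Return + 3: nothing left to delete
        size, has_swap_copy = remaining.pop(0)
        if not has_swap_copy:
            return "FORIDL", free          # Return + 1: must be swapped out
        free += size                       # Return + 2: one segment deleted
    return "swap in", free
```

Segments are deleted one at a time, so no more high segments are given
up than the incoming job actually needs.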



4.8 OUTERR TO SWPREC SECTION

This section processes swap-out errors.

At OUTERR, if the error was caused by the disk system, go to OUTER1.
Otherwise, fall through to the code described below.

For memory parity errors (read from memory by the swapping channel)
record the error flags and the number of swap errors (SWPRC1). Then,
call HGHSWE to take action if the segment being swapped out was a high
segment.

Returns from HGHSWE are as follows:

Return + 1   Job was a high segment; the swap-out error message
             has been sent to all low segments attached to this
             high segment. Name of high segment is cleared so
             no others can connect to it (CLRNAM). Go to
             OUTER0.

Return + 2   Job was a low segment. Fall through to code
             described below.

For low segments, if the error was in a protected part of the job data
area, return the swap space (CEGSWP), and clear all user DDBs and I/O
channels (ZAPUSR).

Print error message and stop job (SWOMES), and clear the swap error
(SL.ERR and SL.CHN) in swap tables (SWPLST(P1)).

Reenter the swapper to start a new operation (at SWAP1).

At OUTER1, the swap-out error is known to be a device error. Call
SWPREC to record the errors, reset the map at MAPBAK, and try to swap
out again in a different place (at SWAPO). The old copy is left on
disk so that the bad area will not be used again.



4.9 SWPREC SUBROUTINE

This subroutine records the amount of swapping space lost because of
swap errors (in and out). It falls through to the SWPRC1 subroutine
that counts the number of swap errors.

Add amount of virtual core lost to error-count register (RH SWPERC).

If the segment lost is not a high segment, add the page map size
(UPMPSZ) to the amount of space lost (T1), and decrease the amount of
virtual core (VIRTAL) by the amount lost. Fall through to SWPRC1.



4.10 SWPRC1 SUBROUTINE

This subroutine stores the error flags in SWPERC from bits 18 through
23 in S and adds 1 to the number of swap errors (LH SWPERC).
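
The layout of SWPERC can be illustrated with a short Python sketch
that uses PDP-10 bit numbering (bit 0 is the leftmost bit of the 36-bit
word, so bits 0-17 are the left half and bits 18-35 the right half).
The function names are hypothetical. The sketch also assumes that the
page count occupies bits 24-35, the part of the right half not used by
the error flags; the text above does not state that explicitly.

```python
def field(word, first_bit, last_bit):
    """Extract PDP-10 bits first_bit..last_bit from a 36-bit word."""
    width = last_bit - first_bit + 1
    return (word >> (35 - last_bit)) & ((1 << width) - 1)


def pack_swperc(errors, flags, pages_lost):
    """Build SWPERC: LH = error count, bits 18-23 = flags, rest = pages."""
    return ((errors & 0o777777) << 18) | ((flags & 0o77) << 12) | (pages_lost & 0o7777)


def record_swap_error(swperc, pages_lost, s):
    """Model one swap error: add pages lost, copy flags from S, count it."""
    errors = field(swperc, 0, 17) + 1          # LH SWPERC
    flags = field(s, 18, 23)                   # error flags taken from S
    lost = field(swperc, 24, 35) + pages_lost  # assumed page-count field
    return pack_swperc(errors, flags, lost)
```

For example, recording an error that lost 5 pages with only bit 18 set
in S gives an error count of 1, a flag field of octal 40, and a page
count of 5.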



4.11 ZCKXPN TO SCNJOB SECTION

This section is entered at ZCKXPN if the swap input scan found no jobs
to swap in. It checks for and forces out expanding jobs. It is also
entered at SCNOUT if jobs must be swapped out to make room for the
next job coming in.

At ZCKXPN, clear the swapper fairness count. In the Class Scheduler
mode, call SQTEST to maintain the primary scan pointer. At CHKXPN, if
there are no expanding jobs (XJOB = 0), go to CHKMIG (check for
migrating jobs, then exit swapper). Otherwise, clear the control flag
(SCNJBS), and go to SCNOU0.

At SCNOUT, set the control flag (SCNJBS) to -1. If a job has already
been selected for swap-out (FORCE # 0), go to SCNOU1.

If there are no jobs waiting to expand (XJOB = 0), go to SCNJOB.
Otherwise, fall through to code below. (Note, entry from ZCKXPN
implies existence of expanding job to be swapped. Such entry never
causes nonexpanding jobs to be swapped.)

Loop through the bit map (XPNMAP) looking for expanding jobs. If an
expanding job is found that has JS.HNG = 0 or no longer has active
devices (ANYDEV), exit to SCNOK with J = job number. If no expanding
jobs are found, execute STOPCD XTH, because there should have been at
least one expanding job (because XJOB # 0). If all expanding jobs have
JS.HNG set, go to CHKMIG if SCNJBS = 0, or to SCNJOB if SCNJBS # 0.

At SCNOK, if the expanding job has core assigned (JBTADR(J) # 0), go
to FORCE0 to force the job out. Otherwise, decrement the count of
expanding jobs (XJOB), clear the expand bit for the job (JXPN), and go
to FORCE0 to try to swap the job out. (The purpose of this code is to
keep the expand bit set until the expanding job has been placed in the
interrupt-level swap tables, when the swap bit is set.)

At CHKMIG, if there are swaps in progress (SPRCNT # 0) or a swap was
just completed (SWPCNT # 0), go to FLGNUL and exit the swapper.

If there are no migrating jobs (MIGRAT = 0), go to FLGNUL and exit the
swapper.

If we have checked all jobs for migration (J greater than HIGHJB) at
CHKMI1, go to MIGDON. Otherwise, for all jobs that are not swapped
(SWP = 0), check to see if the job has any pages on the unit going
down (subroutine PGOFF), and if so, put the job number in MIGRAT and
go to FORCE0 to try to swap the job out. If the job is swapped and
has not already migrated (JS.MIG = 0) and is not currently being
swapped, set MIGRAT = J, and go to FIT1.

Clear the migration flag (set MIGRAT to 0) at MIGDON, and exit the
swapper.

If SCNOU1 is reached, FORCE was already set on a previous clock tick.
If the job being forced had a sharable resource assigned the last time
the swapper tried to swap it out (FORCEF = J = job we want to swap
out), go to FORCE1 to see if the job has given up all resources.
Otherwise, go to FORIDL and make sure the swap indicator is set for
the job (SWP = 1).

The reason for not setting SWP for a job with a sharable resource is
that the job must be run until it gives up all resources before it can
be swapped out and the SWP bit prevents jobs from being selected to
run by the scheduling scan.



4.12 SCNJOB TO FORCE0 SECTION

This section is entered if the job being swapped in will not fit in
free and dormant and idle core, and all expanding jobs have already
been swapped out. Some jobs that are not expanding must be swapped
out to create more space. Swap-out begins when a sufficient number of
jobs are eligible to be swapped (PDMSWP = 1), so that enough space
will be available for the job coming in. No jobs are swapped before
this time, so that runnable jobs will be kept in core as long as
possible.

The routine is entered with AC P1 set to the amount of space (in
pages) needed to swap the next job in (so that both segments are in).

At SCNJOB, set SUMCOR to the amount of free and idle and dormant space
(in pages). Clear the indicator for the first job found to swap out
(MAXJBN).

Set J to the segment number being swapped (from FIT) and call FITHPQ
to set J = associated low segment if the segment being swapped was a
high segment.

Save the low-segment number in FITLOW.

Set AC T4 to the job's high-priority queue number. (If not zero, this
will be used later to give preference to HPQ jobs that you want to
swap in.) If the job in FIT was forced out by the timer (JS.TFO = 1),
set T4 to 0 (this prevents swapper thrashing when an HPQ job and
another job both want to run and will not fit in core simultaneously).

Set SCNSTP so the output scan will stop at the queue of the job being
swapped in (when not forced by timer).

Set AC U to the output scan table (OSCAN) and call the QSCAN
subroutine to select the next job to swap out. The order specified by
OSCAN is shown in the following:




OSCAN

QUEUE     ROUTINE

STOPQ     IQFOR
SLPQ      IQFOR
EWQ       IQFOR
JDCQ      IQBAK1
TIOWQ     IQFOR
JDCQ      IQFOR1
PQ2       OLFOR
PQ1       IQBAK
CMQ       IQBAK
HPQs      IQBAK

The returns from QSCAN are:

Return + 1   All queues have been scanned, job still will not
             fit. Go to NOFIT to service timer and exit from
             the swapper.

Return + 2   Returns next job in AC J, to be processed by code
             described below.

If the job being scanned for swap-out is from a queue beyond the end
of the scan limit (U greater than SCNSTP), the routine has scanned all
jobs of lower priority than the job trying to swap in. If the timer 
has expired (5 seconds have elapsed since the low segment in FITLOW 
was selected) , then allow output scan to search all queues to 
completion (JRST .+2) so that the job being swapped in can replace 
jobs of higher priority if it has been waiting too long. Otherwise, 
go to NOFIT to service timer and exit from the swapper. 

Reject the job being scanned for output (JRST (T2)) if it is the job
being swapped in (that is, the low segment associated with a high
segment being swapped in), or if the job does not have core assigned
(JBTADR(J) = 0).

Set AC W to the address of the PDB. If there is no PDB, go to SCNJB1
and ignore the in-core protect time check. Also, go to SCNJB1 if the
job has the swap bit (SWP) set to 1, or if the job going out is in
background batch and the job coming in is not.

At SCNJB0, if the job's in-core protect time has expired (PDMSWP = 1)
and the job may be swapped (NSWP = 0), fall through to the code
described in the next paragraph. Otherwise, if the in-core protect
time has not expired, reject the job if the job coming in is not HPQ
(JUMPE T4,(T2)). If the job coming in is in HPQ, test to see if the
job can be swapped (NSWP = 0). Reject the job if it cannot be swapped
(JRST (T2)).

Reject the job if it is in a processor queue (other than background
batch) and the job coming in is background batch.

At SCNJB2, if the job has JS.HNG equal to 0, go to SCNJB3. Otherwise,
call ANYDEV to see if the job still has active I/O. If yes, reject
job for swap-out. If no, fall through to SCNJB3.

At SCNJB3, execute special code to prevent system hang in rare
circumstances.

Reject the selected job (JUMPE F,(T2)) if it is in the process of
swap-in and status indicators have not yet been properly set up
(avoids instant swap-out).

Set F to the size of the job (IMGIN) plus the size of the page map
(UPMPSZ). Call subroutine FORSIZ in SEGCON to estimate the
high-segment size.

If the selected job has a high segment, and it is in core, the
subroutine FORSIZ adds to AC F an estimate of the high-segment size
according to the following formula:

Estimated Size = (High-Segment Size/In-Core Count) + 1
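
As a worked example of this formula, the Python sketch below charges
each sharer an equal part of the high segment plus one page. The
function name is hypothetical, and the sketch assumes integer division
and an in-core count of at least 1; neither detail is stated in the
text.

```python
def estimated_high_segment_size(high_segment_pages, in_core_count):
    """Estimate this job's share of a high segment that is in core."""
    # Integer division is an assumption; the extra page rounds the share up.
    return high_segment_pages // in_core_count + 1
```

A 10-page high segment shared by 3 in-core jobs is therefore charged
to each job as 4 pages, while a job that is the only user of an 8-page
high segment is charged 9 pages.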

If a job has been selected for swap-out, go to FORCE2. Otherwise, if
a job is running on the slave, set SW0JOB = job number and reject the
job (the slave will stop running the job at next opportunity). If a
job has a real high segment with a SAVE in progress (ANYSAV), reject
the job for swap-out. Otherwise, set MAXJBN to the job number of the
first job found in the scan that is eligible for swap-out.

At FORCE2, add the size of this job to the total found so far
(SUMCOR). If the job being swapped in still will not fit
(P1 > SUMCOR), go back and see if there are more jobs eligible to swap
out (JRST (T2)). Otherwise, fall through to the code described below.
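
The accumulation of SUMCOR over the output scan can be sketched as
follows. This Python model is a simplification: the function name is
hypothetical, the candidates are assumed to have already passed the
eligibility checks described above, and they are supplied as
(job number, size) pairs in output-scan order.

```python
def select_job_to_force_out(pages_needed, free_idle_dormant, candidates):
    """Return the first eligible job once enough core has been found.

    Returns None when all candidates together still do not make room.
    """
    sumcor = free_idle_dormant             # initial SUMCOR
    first_job = None                       # MAXJBN, cleared on entry
    for job, size in candidates:
        if first_job is None:
            first_job = job                # first eligible job in the scan
        sumcor += size
        if pages_needed <= sumcor:         # incoming job now fits
            return first_job
    return None
```

No job is forced out until the scan has shown that enough eligible
jobs exist. The job returned is the first one found, which is the
lowest-priority candidate.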

In the Class Scheduler mode, if a background batch job is being fit
(BBFIT # 0), clear SCNSWP and allow the primary scan pointer to
advance on the next swap-in scan. Also, calculate the time at which
the next background batch job is allowed to swap in (UPTIME + SCDBBS).

Set J to MAXJBN, the first job found in the output scan, and therefore
the lowest priority job. If the timer has expired (INFLG # 0) and the
job is in a processor queue, then set the job-forced-out-by-timer flag
(JS.TFO).

Fall through to FORCE0 with J = job to be forced out.



4.13 FORCE0 TO SWAPO SECTION

This section determines whether a job can be swapped out, or if it
must wait for I/O to finish or for sharable resources to be given up.

At FORCE0, if the job selected for swap-out (J = job number) is not
runnable with respect to CACHE, exit to FLGNUL. If it has a SAVE in
progress, exit the swapper (JRST (T2)) without setting FORCE so that
another job will be selected next tick. This is meaningful if it is
entered from SCNOUT for expanding jobs. This has already been checked
if it was entered from SCNJOB above.

If the segment being swapped is a high segment (J > JOBMAX), go to
FORCEA.

If the segment being swapped is a low segment, and the job was hung
with I/O active (JS.HNG), go to SWAPO. (The remaining checks were
already made before the hung indicator was set.) If the job is not
marked hung, check for disk-sharable resources at FLSDR. From FLSDR,
returns are:

Return + 1   No disk-sharable resources. Go to FORCEA.

Return + 2   Job currently assigned one or more disk-sharable
             resources. Execute code described below.

Save the job number of the segment being swapped (set FORCE equal to 
J) . Also, store an indicator that job is being forced with sharable 
resources (FORCEF = job number) . Go to FLGNUL to exit the swapper 
without swapping a job this tick. The scheduler will then run the job 
at highest priority until it gives up all sharable disk resources so 
it can be swapped. 

At FORCEA, check (FORHGH) to see if there is a high segment that can
be swapped before this segment. This is true if this is a low segment
that has a high segment in core (SWP = 0) that is not expanding, has a
core count of one (this low segment), and is not associated with the
job being swapped in. If yes, return the high-segment number in J. If
no, return the low-segment number in J, and set the shuffle bit (SHF)
to 1 so that I/O will stop after the buffer is full.

At FORIDL, set the swap bit (SWP) to 1.

At FORCEL, save the job number of the segment to be swapped out into 
the force-out indicator (FORCE) . 

At FORCE1 (entered from SWP2 at the clock tick) , see if the job can 
now be swapped, as well as from above. If not forcing job with a 
disk-sharable resource (FORCEF = 0) , go to FORCES. Otherwise, check 
to see if the job still has resources (FLSDR).. If yes, exit the 
swapper at FLGNUL and check again next tick. If no, clear FORCEF and 
go back to FORCEA to complete steps that were delayed while waiting 
for job to give up resources. 

At FORCES, if the job has no core assigned (JBTADR(J) = 0), go to
SWAPO and swap out the next job (it cannot have any active devices).
Otherwise, check to see if the job is the current job (J = .C0JOB).
If yes, exit the swapper at NOFORC and wait for the scheduler to
context switch out of the job. If no, check for active devices or for
the current job on the slave processor (ANYDEV). If yes, exit the
swapper at NOFORC. If no, fall through to SWAPO and swap out the job.
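
The checks made at FORCES can be summarized as a small decision
function. The following Python sketch is illustrative only; it is not
monitor code. The Job record and its field names are hypothetical
stand-ins for the JBTADR(J) test and the ANYDEV check, and the returned
strings are simply the labels named in the text.

```python
from dataclasses import dataclass


@dataclass
class Job:
    number: int
    core_assigned: bool    # hypothetical stand-in for JBTADR(J) not equal to 0
    active_devices: bool   # hypothetical stand-in for the ANYDEV check


def forces_decision(job: Job, current_job: int) -> str:
    """Return the label the swapper goes to next, following the text."""
    if not job.core_assigned:
        return "SWAPO"     # no core, so no device can be active
    if job.number == current_job:
        return "NOFORC"    # wait for the scheduler to switch away
    if job.active_devices:
        return "NOFORC"    # I/O active, or running on the slave processor
    return "SWAPO"
```

A job that is neither current nor doing I/O falls through to SWAPO;
every other job with core is retried through NOFORC on a later tick.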



4.14 SWAPO TO NOFIT SECTION 

This section puts the swap-out information in the SWPLST tables and
calls the interrupt-level routines.

At SWAPO, clear the output timer. If the segment being swapped is a
low segment, delete the job from the output list and the
just-swapped-in list. Clear JS.XPN, JS.HNG, and JS.NNQ. Save the job
number of the last job swapped out (set LASOEJT equal to J), and clear
the force flag (set FORCE equal to 0). Clear the SW0JOB flag. If the
job has zero core (JBTADR(J) = 0), go to SWP1 to start a new
operation, because there is no need to swap out the job. Otherwise,
continue below.

Set AC U to the job input size (IMGIN). If the segment is a low segment
(J less than or equal to JOBMAX), set the shuffle bit (SHF) to 1 to
indicate that a swap-out is in progress.

Set the segment output size (IMGOUT) from the input size (IMGIN, as
stored in U above), unless the job expanded (IMGOUT not equal to 0).
In that case, leave it as set by the expand routine and set U to the
new segment size (IMGOUT).

If the segment is a low segment, add the user page map size (UPMPSZ)
to AC U (to be used in the call to SWPSPC).

Save IMGOUT (in AC F), and set it to zero for the call to SWPSPC. Call
SWPSPC.

If there is no space, go to SWAP03 (to exit the swapper and try again
next tick). Otherwise, get device storage space and restore IMGOUT to
the saved value (AC F).

Save J, build the SWPLST entry (at BOSLST), add one to the number of
swap operations in progress (SPRCNT), start I/O if it is not already
going (at SQOUT), and restore J.

If the job that was just swapped was a low segment that expanded,
decrement the count of expanding jobs (XJOB), clear the entry in the
bit table (XPNCLR), clear the table expand bit (JXPN), and go to
CHKXPN to swap out the expanding jobs. When there are no more,
exit from the swapper.

At SWAP03, set IMGOUT to 0 unless it is different from IMGIN, set
FORCE to the job number, and go to FLGNUL.



4.15 NOFIT TO NOFIT2 SECTION 



This section is entered every clock tick that the job in FIT cannot be
swapped in, because there is not enough space even if all idle and
dormant segments are deleted and all jobs that are eligible to be
swapped out are swapped out. Recall that eligible to be swapped out
implies that the jobs have expired their in-core protect time and are
of lower priority in the swap-out scan than the job being swapped in.

A timer keeps track of how many ticks the job has waited to swap in. 
After 6 seconds, the timer expires and sets a flag to indicate that 
the swap-out scan routine (SCNJOB) may now ignore the queue position 
and swap jobs out with expired in-core protect, even if they are of 
higher priority. 

This timer is needed only for very special cases. For example, if an 
HPQ job and a very large job both want to run and cannot fit in core 
simultaneously, then the large job will not displace the HPQ job until 
the timer expires, because the HPQ job is always higher in the queue. 
No known special cases exist for PQ1 and PQ2, because of the orderly 
operation of the Round Robin algorithm. 

At NOFIT, if the job selected for swap-in was a background batch job,
deselect it (set FIT and BBFIT to 0), reset the scan pointer as though
no job were swapped in (at subroutine 5QTSST), and go to FLGNUL.

At NOFIT1, if the job being swapped in was preempted by an HPQ job,
restore the timer to the value it held when the job was preempted and
go to NOFIT7.

At NOFIT3, if the frustrated job is the same as last time
(FITLOW = INFLGJ), go to NOFIT7. Otherwise, start the frustration
timer for this job and go to FLGNUL.

At NOFIT7, if the job being timed has been waiting 6 seconds, set the
frustration flag (INFLG) to -1 and go to FLGNUL.
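
The NOFIT3 and NOFIT7 steps amount to a per-job tick counter. The
following Python sketch illustrates the idea under stated assumptions:
a 60-tick-per-second clock (consistent with the "6 ticks (1/10 second)"
figure used in Chapter 6), a flag that is cleared when timing restarts
for a different job (the text does not say when INFLG is cleared), and
hypothetical attribute names that only loosely mirror INFLGJ and INFLG.

```python
TICKS_PER_SECOND = 60                     # assumption: 60 Hz clock
FRUSTRATION_LIMIT = 6 * TICKS_PER_SECOND  # the 6-second limit from the text


class FrustrationTimer:
    def __init__(self):
        self.timed_job = None   # plays the role of INFLGJ
        self.ticks = 0          # ticks waited by the job being timed
        self.flag = 0           # plays the role of INFLG (-1 when expired)

    def tick(self, fit_job):
        """Call once per clock tick on which fit_job could not be swapped in."""
        if fit_job != self.timed_job:
            # A different job is frustrated: start timing it.
            self.timed_job = fit_job
            self.ticks = 0
            self.flag = 0       # assumption: restarting clears the flag
            return self.flag
        self.ticks += 1
        if self.ticks >= FRUSTRATION_LIMIT:
            self.flag = -1      # the swap-out scan may now ignore queue position
        return self.flag
```

Once the flag is -1, the swap-out scan is free to displace
higher-priority jobs whose in-core protect time has expired.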



4.16 NOFIT2 TO ZERFIT SECTION 

This section clears the FIT and BBFIT indicators if a job was selected
in FIT and then the high segment it was connected to was expanded by
some other job that is sharing it. (See Section 4.7.)



4.17 ZERFIT TO NOFORC SECTION 

This section clears the FIT and BBFIT indicators if an HPQ job wants
to swap in and certain conditions have been met. (See Section 4.1.) It
also stores the frustration time for the job being preempted in the
PDB for that job (.PDHZF).



4.18 NOFORC TO SWAP1 SECTION 

This section is entered every clock tick that the job in FORCE cannot
be swapped out because it has active I/O or is the current job on some
CPU. A timer keeps track of how many ticks the job has been selected
for swap-out. After 3 seconds, the timer expires. The job is
deselected for swap-out, and is marked hung as far as swap-out is
concerned.

At NOFORC, if the job being swapped out is a high segment, exit to
FLGNUL. If this job is the same as the previous job being timed
(J = OUFLGJ), go to NOFOR1. Otherwise, start the timer for this job
and go to FLGNUL.

At NOFOR1, if the job being timed has been waiting for 3 seconds, set
JS.HNG (so that the swapper will not select this job for swap-out
again until I/O is no longer active). Clear FORCE, FORCEF, and
OUFLGJ.

On non-virtual-memory systems, if the selected job was expanding
(JS.XPN = 1), set JXPN to 1 and reenter the job in the table of
expanding jobs (this is done because non-virtual-memory systems clear
JXPN as soon as the job is selected for swap-out).



4.19 CHGSWP TO CHG1 SECTION 

This section changes disk-swapping space allocations (VIRTAL).

At CHGSWP, save the present input size (IMGIN) in T2. If the new core
assignment is zero (T1 = 0), go to CHG1; otherwise, continue below.

Convert the new core assignment to pages, store it in IMGIN, and save
AC J.

Compute the change to the system's virtual address space and update
the indicator (VIRTAL).

Restore AC J and exit.



4.20 CHG1 TO UNSWAP SECTION

This section calls the subroutine (GIVBKH) to give back physical disk
space.

At CHG1, if the segment has no space on disk (T2 = 0), go to ZERSWP.
Otherwise, increment VIRTAL by T2 (plus the size of the page map if it
is a low segment).

At ZERSWP, save AC a, and if the disk output size is 0 (IMGOUT = 0),
go to CHG10. Otherwise, set up T1 and call ZERSWH.

From ZERSWH, the returns are:

Return +1    Call GIVBKH, low segment or no error in high
             segment (gives back disk space).

Return +2    Restore 0, error in high segment or fall through
             from above. Fall through to UNSWAP.



4.21 UNSWAP TO RTNDSP SECTION

This section housekeeps job and swapper flags after a segment has been
swapped in.

At UNSWAP, clear the swap and shuffle bits (SWP and SHF).

If the job just swapped was being forced (J = FORCE), clear FORCE and
FORCEF.

At UNSWP1, set the disk output size to 0 (IMGOUT). For low segments,
clear LH JBTSWP(J).

Exit (POPJ P,).



4.22 RTNDSP TO GIVBKH SECTION 
This section returns disk space. 

4.23 GIVBKH TO XPAND SECTION 

This section clears the SWPLST entry and calls RTNDSP to return
physical disk space.



4.24 XPAND TO XPANDH SECTION

This section gets more core for a job by swapping it out and then
swapping it back in again.



4.25 XPANDH TO SCHED. SECTION

This section stops a job and swaps it out if it has just been
connected to a sharable high segment that is on disk or is being
swapped in or out. The job remains stopped until the high segment is
in core.



CHAPTER 5
SCHEDULING PARAMETERS



The scheduler contains a variety of control parameters that may be set 
by an installation to suit its particular needs. The non-class 
scheduler provides a basic set of parameters. The class scheduler 
provides a number of additional parameters. 

This chapter describes the location of the parameters and the default 
values assigned at start up. The default values may be modified by an 
installation as desired. Also, in the class scheduler, any parameter
may be modified dynamically with a SCHED. monitor call (using the
SCDSET program).



5.1 PROCESSOR QUEUE TIME SLICES

The processor queue time slices are made up of two parts: in-core 
protect time and quantum runtime. 

One of the following formulas determines the in-core protect time (in
ticks) for all processor queues.

1. At swap in, in-core protect is

min (PROTM, JOBSIZ*PROT+PROT0+8333)/16667

2. When requeued to back of PQ2 because of time-slice
expiration, in-core protect time is

PROT1
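
As an illustration, formula 1 can be evaluated as follows. This Python
sketch follows the formula exactly as printed, with 16667 microseconds
per tick and 8333 as the half-tick rounding term. The use of integer
division and the sample parameter values are assumptions made for the
example, not values taken from the monitor.

```python
TICK_US = 16667              # microseconds per clock tick (divisor in the formula)
HALF_TICK_US = TICK_US // 2  # 8333, the rounding term in the formula


def protect_at_swap_in(jobsiz, prot, prot0, protm):
    """In-core protect time in ticks at swap-in, per formula 1 as printed.

    prot, prot0, and protm are in microseconds; jobsiz is the job size.
    Integer division is an assumption of this sketch.
    """
    return min(protm, jobsiz * prot + prot0 + HALF_TICK_US) // TICK_US


# Hypothetical parameter values, chosen only to show both branches of min().
small = protect_at_swap_in(jobsiz=10, prot=50_000, prot0=2_000_000, protm=5_000_000)
large = protect_at_swap_in(jobsiz=100, prot=50_000, prot0=2_000_000, protm=5_000_000)
print(small, large)
```

With a nonzero PROT the protect time grows with job size until the
PROTM ceiling takes over; with PROT equal to zero every job receives
the same protect time.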

The indicated ONCMOD tables indexed by the primary swapping device 
determine the default values for the in-core protect-time parameters. 
However, the indicated SCHED. monitor call may dynamically modify 
them. 



Scheduling     ONCMOD     SCHED.
Parameter      Table      Monitor Call

PROT           PROTTB     PROT
PROT0          PRT0TB     PROT0
PROTM          PRTMTB     PROTM
PROT1          PRT0TB     PROT1



To compute quantum runtimes, use the following formula:

quantum run = min (QMX, QAD + [(size of job in K)*QML]/QRANGE)

where QMX, QAD, and QML come from tables QMXTAB, QADTAB, and QMLTAB
indexed by processor queue, with index 0 for PQ1, 1 for PQ2, and 2 and
following for HPQs.
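
The quantum runtime formula can be sketched in Python as shown below.
The sketch reads the formula as QAD plus (job size times QML) divided
by QRANGE, capped at QMX, which agrees with the PQ2 behavior described
in Chapter 6 (one additional tick per K up to the maximum). Integer
arithmetic and the function name are assumptions of the sketch; the
sample values are the PQ1 and PQ2 values listed in Chapter 6.

```python
def quantum_runtime(size_k, qad, qml, qmx, qrange):
    """Quantum runtime in ticks for a job of size_k K words.

    qad, qml, and qmx are the entries of QADTAB, QMLTAB, and QMXTAB for
    the job's processor queue; qrange is QRANGE in K.
    """
    return min(qmx, qad + (size_k * qml) // qrange)


# PQ2 values from Chapter 6: QAD = 45, QML = 45, QMX = 90, QRANGE = 45K.
for size in (1, 10, 45, 100):
    print(size, quantum_runtime(size, qad=45, qml=45, qmx=90, qrange=45))

# PQ1 values from Chapter 6: a zero multiplier gives a fixed quantum.
print(quantum_runtime(30, qad=8, qml=0, qmx=8, qrange=45))
```

Setting QML to zero makes the quantum independent of job size, which
is how the fixed PQ1 and HPQ quanta arise.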

For PQ1, the default values are set in COMMON and modified by
SCHED. monitor calls as indicated.



Scheduling     COMMON        SCHED.
Parameter      Parameter     Monitor Call

QADTAB(0)      QQRUN1        TIME BASE
QMLTAB(0)      0             TIME MULTIPLIER
QMXTAB(0)      QQRUN1        TIME MAXIMUM



For PQ2, the default values are set from ONCMOD tables indexed by the
primary swapping device and modified by SCHED. monitor calls as
indicated.



Scheduling     ONCMOD     SCHED.
Parameter      Table      Monitor Call

QADTAB(1)      ADDTAB     TIME BASE
QMLTAB(1)      MULTAB     TIME MULTIPLIER
QMXTAB(1)      MAXTAB     TIME MAXIMUM



For HPQs, the quantum runtimes are defined by macros at the location
of QADTAB, QMLTAB, and QMXTAB in COMMON. The values generated depend
on the number of HPQs. SCHED. monitor calls cannot change HPQ quantum
runtimes. All processor queues use QRANGE. It is set to the default
value of 45K directly in COMMON, and may be modified by a Time
Multiplier subfunction of the SCHED. monitor call.

In-core protect and quantum runtimes have a different meaning for each 
of the processor queues. 



5.1.1 PQ1 Time Slice

For PQ1 jobs, quantum runtime is a measure of the amount of time that
the job receives exceptional (PQ1 level) attention for scheduling
after it is swapped in. When this time expires, the job is requeued
to the back of PQ2 (without being marked for swap-out) and is assigned
the PQ2 quantum runtime. A PQ1 job is assigned the same in-core
protect time as PQ2 jobs when it is swapped in. On requeue to PQ2, it
retains any leftover in-core protect time.

This procedure gives fast scheduling response to PQ1 jobs that need
to run after expiring the PQ1 quantum runtime. (Once a job is
swapped in, it is allowed to run at least as long as the PQ2 time
slice, if it does not go into long-term wait.)



5.1.2 PQ2 Time Slice

For PQ2 jobs, the parameters for in-core protect and quantum runtime 
control the bias of the scheduler for throughput versus response and 
for I/O versus CPU. 

Throughput versus response is controlled by increasing or decreasing
the magnitude of both parameters. As the parameters are increased,
jobs expire their time slices more slowly, swapping rate decreases,
and throughput is improved (less core is tied up in swapping).
Response is correspondingly degraded because jobs wait longer to swap
in. When you decrease both parameters the effect is reversed.

I/O versus CPU response is controlled by changing the ratio of in-core
protect to quantum runtime. Increasing only quantum runtime favors
CPU-bound jobs. Increasing only in-core protect favors I/O-bound jobs,
while reducing it tends to favor CPU-bound jobs.



5.1.3 HPQ Time Slice 

For HPQ jobs, quantum runtime is set to a very small value so that if
more than one HPQ job wants to run, the scheduler will context switch
between jobs frequently.

The value of in-core protect time for HPQ jobs is the same as for PQ2
jobs. Normally, this is not significant because HPQ jobs can only be
swapped out by other HPQ jobs. (It would be significant if an
installation wanted to allow two HPQ jobs that did not fit to be in
memory simultaneously.)

PQ1 and PQ2 jobs do not normally replace HPQ jobs, because the
swapping output scan does not swap out a job of higher priority than
the job coming in (even if the job in core has expired its in-core
protect time). The only exception is if the 6-second fairness timer
expires.



5.2 SWAPPING AND SCHEDULING FAIRNESS COUNTS 

The PQ1 versus PQ2 swapping and scheduling fairness counts are defined 
in COMMON with the default values listed in Table 5-1. They may be 
modified with the indicated functions to the SCHED. monitor call. 



Table 5-1
Default Values of Swapping and Scheduling Fairness Counts

Parameter   Description                     Default   SCHED. Monitor Call

IFC0        Swapping Threshold              5         Swapper Fairness
SFC0        Scheduling Threshold (CPU0)     20        Scheduler Fairness
SFC1        Scheduling Threshold (CPU1)     20        Scheduler Fairness



Note that the SCHED. monitor call sets both CPUs to the same 
scheduling fairness threshold in a dual-processor system. 



The scheduling and swapping fairness counts are a measure of the
number of consecutive times the scheduling/swapping scan has selected
a PQ1 job. After a specified threshold has been reached, a PQ2 job is
selected, if available, by scanning with an alternate scan table that
has PQ2 ahead of PQ1. Small threshold values favor PQ2. Large values
favor PQ1.
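
The effect of a fairness count can be illustrated with a small Python
model. This is a sketch only: the class and method names are
hypothetical, the queues are plain lists of job numbers, and resetting
the count whenever a PQ2 job is selected is an assumption that the text
does not spell out.

```python
class FairnessScan:
    def __init__(self, threshold):
        self.threshold = threshold
        self.pq1_run = 0          # consecutive PQ1 selections so far

    def select(self, pq1, pq2):
        """Pick the next job number from the two queues, or None if both are empty."""
        if self.pq1_run >= self.threshold and pq2:
            order = (pq2, pq1)    # alternate scan table: PQ2 ahead of PQ1
        else:
            order = (pq1, pq2)    # normal scan table: PQ1 ahead of PQ2
        for queue in order:
            if queue:
                job = queue.pop(0)
                # Assumption: selecting a PQ2 job resets the count.
                self.pq1_run = self.pq1_run + 1 if queue is pq1 else 0
                return job
        return None


scan = FairnessScan(threshold=2)
pq1, pq2 = [1, 2, 3, 4], [10, 11]
picks = [scan.select(pq1, pq2) for _ in range(7)]
print(picks)
```

With a threshold of 2, every third selection goes to PQ2 while PQ2 has
work; raising the threshold shifts the balance toward PQ1.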



5.3 IN-CORE FAIRNESS FACTOR 

The in-core fairness factor, SCDIOF, is set to an initial value of 50%
in SYSINI. It may be modified with the Incore Fairness subfunction of
the SCHED. monitor call.

The in-core fairness factor determines the percentage of time that PQ2 
jobs that have done a GETSEG and have not yet expired 1 time slice are 
scanned for swap-in ahead of regular PQ2 jobs. 

This is the last of the scheduling parameters for the non-class 
scheduler. The following parameters apply to the class scheduler 
only. 



5.4 CLASS QUOTAS AND MICROSCHEDULING INTERVAL 

The class quotas are made up of the following three sets of 
parameters: 

1. Primary percentages. 

2. Secondary allocations. 

3. Fixed swapping indicators. 

The table CLSSTS stores the primary percentages as well as the fixed
swapping indicators. The packed table PSQTAB, however, only stores
the primary percentages. The initial values of both of these tables
are zero at start up. The primary percentages and fixed swapping
indicators are modified with the Primary Percentage subfunction of the
SCHED. monitor call.

The table CLSQTA stores the secondary allocations. The initial value
of this table is zero. The secondary allocations are modified with
the Secondary Allocation subfunction of the SCHED. monitor call.

Item SCDINIT stores the microscheduling interval. The initial value
is zero. It is modified by the Micro Scheduling Interval subfunction
of the SCHED. monitor call.

The default values of zero for the above parameters cause the system 
to start up in Round Robin mode. To enter Class Scheduler mode, the 
parameters must be set with the SCDSET program. 

The system enters Class Scheduler mode whenever the following 
conditions are met: 

1. The primary percentages add to 100%. 

2. The microscheduling interval is nonzero. 

Conversely, the system enters Round Robin mode if either of the above
conditions is not met.

The primary percentages define the amount of system resources granted
to each class. The secondary allocations define the proportion of
leftover resources allocated to each class. Leftover resources occur
when some of the classes do not use all of their primary percentages.

If a class has a zero primary percentage, it is not guaranteed any
portion of the machine. If it has a nonzero secondary allocation, it
will get a share of leftover resources; if not, it will not be
swapped or scheduled at all.

If a class has a nonzero primary percentage and a zero secondary
allocation, it will be swapped and scheduled only a fixed amount of
time. In other words, it will get exactly its primary percentage and
no more.

The fixed swapping indicator causes a class to be swapped at a fixed 
rate, but scheduled as though it were nonfixed. This assumes that the 
class has a nonzero primary percentage and a nonzero secondary 
allocation. The class is swapped using only the primary percentage, 
ignoring the secondary allocation as though it were zero. Scheduling 
uses both the primary percentage and the secondary allocation. 

This feature defines classes that will be treated as fixed classes as 
long as there are other classes swapping in and out. When there are 
no other classes to force them out, the fixed swapping class will 
remain in memory and be scheduled ahead of background batch. 
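
The rules in this section can be collected into two small functions.
The Python sketch below is illustrative, not monitor code: the function
names and the descriptive strings they return are hypothetical, and
only the conditions themselves come from the text above.

```python
def scheduler_mode(primary_percentages, microsched_interval):
    """Class Scheduler mode needs primaries totalling 100% and a nonzero interval."""
    if sum(primary_percentages) == 100 and microsched_interval != 0:
        return "class"
    return "round-robin"


def class_treatment(primary, secondary, fixed_swap=False):
    """Return (swapping basis, scheduling basis) for one class."""
    if primary == 0 and secondary == 0:
        return ("never", "never")
    if primary == 0:
        return ("leftover only", "leftover only")
    if secondary == 0:
        return ("primary only", "primary only")
    if fixed_swap:
        # Fixed swapping: swapped as though the secondary allocation were zero.
        return ("primary only", "primary plus leftover")
    return ("primary plus leftover", "primary plus leftover")


print(scheduler_mode([50, 30, 20], 4))
print(class_treatment(30, 10, fixed_swap=True))
```

At start up all percentages and the interval are zero, so the first
function reports Round Robin mode until the parameters are set.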



5.5 BACKGROUND BATCH PARAMETERS 

Background batch is controlled by two parameters: background batch 
class and background batch swap time. 

Background batch class is stored in parameter BBSUBQ. The initial
value of -1 is set in SYSINI. It may be modified with the Background
Batch Class subfunction of the SCHED. monitor call.

Background batch swap time is stored in parameter SCDBBS. The initial
value of zero is defined in COMMON. It may be modified with the
Background Batch Swap Time subfunction of the SCHED. monitor call.

Any class may be designated as the background batch class. In 
general, it has a zero primary percentage and a zero secondary 
allocation, but this is not a restriction. If background batch has a 
primary percentage, it is guaranteed a certain level of response. If 
it has a secondary allocation, it is allowed a share of leftover 
resources. In any event, it is also scanned whenever there are no 
other classes to run. The negative initial value implies there is no 
defined background batch class. 

The background batch swap time defines the rate in ticks at which
background batch jobs can be swapped. In situations where the
timesharing load fluctuates between existence and nonexistence of
timesharing jobs, it can be used to prevent thrashing.



5.6 RESPONSE FAIRNESS FACTOR 

The response fairness factor is stored in parameter SCDJIL. The
initial value of 10% is set in SYSINI. It may be modified with the
Response Fairness subfunction of the SCHED. monitor call.

The response fairness factor defines the percentage of time that jobs
are scheduled in the order in which they were swapped in versus
scheduling by the class scheduling scan. (A list of jobs just swapped
in is maintained for all jobs that have not yet expired 1 time slice.)

A value of 100% gives the best possible short-term response with
reduced accuracy when jobs do not exist on the swapper in sufficient
numbers to satisfy the desired primary percentages.

A value of 1% gives the best possible accuracy with reduced response
when many jobs in memory are in classes that are rarely scheduled.

The range of acceptable values for the response fairness factor is
from 1% to 100%. Values of 10% and above are recommended for acceptable
short-term response. A zero value is not allowed.



5.7 AVERAGE SWAP TIME 

The average swap time is stored in variable SCDSWP. It may be
modified with the Average Swap Time subfunction of the SCHED. monitor
call. The initial value is calculated in ONCMOD by multiplying the
time it takes to swap one page by the specified average job size,
PAVJSP, and adding in the swapper latency time. The time required to
swap one page depends on the speed of the installation's swapping
device.

The default value for PAVJSP is 20 pages, or 10K. 
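
The initial value described above is a simple linear expression. The
Python sketch below shows the arithmetic; the function name, the
parameter names, and the sample per-page and latency times are
hypothetical, and only the default PAVJSP of 20 pages comes from the
text.

```python
def initial_average_swap_time(page_swap_time, latency, pavjsp=20):
    """Average swap time = (time to swap one page * PAVJSP) + swapper latency.

    All times must be in the same unit; the unit is not fixed by this sketch.
    """
    return page_swap_time * pavjsp + latency


# Hypothetical device: 2 time units per page, 10 time units of latency.
print(initial_average_swap_time(page_swap_time=2, latency=10))
```

A faster swapping device lowers the per-page term, so the same
PAVJSP yields a smaller initial average swap time.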

The average swap time is used to calculate when the swapper should
advance to the next class in the primary table when there are no jobs
in the system to swap. This parameter is required to achieve correct
swap-in rates for fixed classes when there are no jobs in any other
classes. Fixed classes have no secondary allocations. In fact, they
can only swap in when the primary percentage pointer has been advanced
to an entry for their class.



5.8 JOB CLASS 

The class to which each job belongs is stored by job number in bits 14 
through 17 of the table JBTSCD. The initial value at system start up 
is all zeros. 

The job's class is set by LOGIN using the Job Class subfunction of the
SCHED. monitor call. It can also be set by the SCDSET program.



5.9 CLASS RUNTIME

The class runtimes are set by the monitor and are read by the Runtime
subfunction of the SCHED. monitor call through the SCDSET program.
The values are reset to zero whenever the primary percentages are
changed.



CHAPTER 6
DETERMINATION OF PARAMETERS FOR SCHEDULER



This chapter uses the scheduler as an example of how the default 
scheduling parameters are determined. This chapter also discusses the 
rationale behind a choice of parameters for a specific system. 

The Western Michigan University (WMU) computer system is a KI10 with
160K of memory, six RP02 disk drives and two RP03 disk drives (on one
channel), two RD10 swapping disks (on a second channel), and two
TU20 tape drives (on the I/O BUS). The system is configured for 74
jobs. The monitor is 6.02A with virtual-memory option (the swapper
and scheduler are modified to be equivalent to the WMU class scheduler
in 6.03).

The job mix is made up of a wide variety of programs. Compilations
are primarily BASIC and FORTRAN with a fair amount of COBOL, MACRO,
and ALGOL. User programs and system library programs cover many areas
including simulation, mathematics, statistics, engineering, chemistry,
physics, management, and so forth.

Most activity is terminal oriented. Of the 74 job slots, 3 are
allocated for BATCH. The maximum user core is 35K during prime time.
The average job size is 10K. The majority of jobs are relatively
small and conversationally oriented (that is, TECO, LINED, and small
student programs). There are a fair number of large jobs that make
heavy use of the CPU and/or disk I/O (large compilations, STATPACK,
and virtual-memory jobs).

The overall performance objectives are: 

1. To provide good response to conversational jobs (PQ1).

2. To maintain a reasonable level of system throughput for 
system and I/O users. 

3. To provide a good balance of CPU versus I/O jobs in core so 
that multiprogramming is effective over a wide range of job 
mixes. 

The discussion of specific parameter values in this chapter parallels
the general discussion in Chapter 5. For each section in Chapter 5,
there is a corresponding section in this chapter describing how the
parameters are determined for the scheduler.



6.1 PROCESSOR QUEUE TIME SLICES

For in-core protect time and quantum runtime, the ONCMOD tables are
indexed by an RD10 as a primary swapping device. The values
referenced in ONCMOD tables and the values transferred to scheduling
tables are indicated in the following.

Table 6-1 lists the parameter values for in-core protect time. 



Table 6-1
In-Core Protect-Time Parameter Values

Scheduling Parameter                  ONCMOD Parameter

Name     Value      Units             Name      Value      Units

PROT     0          microseconds      PROTTB    0          microseconds
PROT0    3000000    microseconds      PRT0TB    3000000    microseconds
PROTM    3000000    microseconds      PRTMTB    3000000    microseconds
PROT1    180        ticks             PRT0TB    3000000    microseconds



These values imply a fixed 3-second in-core protect time for all jobs,
regardless of job size, both at swap-in and when requeued for
time-slice expiration.

If desired, PROTTB could be set nonzero to vary the assignment at
swap-in by job size. PRTMTB would need to be modified also to define
the maximum allowed value.



Table 6-2 lists the quantum runtime parameter values for PQ1, which
are generated directly in the scheduling tables in COMMON.



Table 6-2
PQ1 Quantum Runtime Parameter Values

Scheduling Parameter              COMMON Parameter

Name         Value    Units       Name      Value    Units

QADTAB(0)    8        ticks       QQRUN1    8        ticks
QMLTAB(0)    0        ticks                 0        ticks
QMXTAB(0)    8        ticks       QQRUN1    8        ticks



These values imply a fixed 8 ticks for all PQ1 jobs, regardless of job
size.

These values may be changed by inserting the new values directly in
COMMON, or by inserting code in ONCMOD to set up the values.

Table 6-3 lists the quantum runtime parameter values for PQ2.



Table 6-3
PQ2 Quantum Runtime Parameter Values

Scheduling Parameter              ONCMOD Parameter

Name         Value    Units       Name      Value      Units

QADTAB(1)    45       ticks       ADDTAB    750000     microseconds
QMLTAB(1)    45       ticks       MULTAB    750000     microseconds
QMXTAB(1)    90       ticks       MAXTAB    1500000    microseconds



The value of QRANGE in COMMON is 45K.

The values imply a base quantum runtime of 0.75 second for a 1K job.
This grows one tick per K of job core size to a maximum of 1.5 seconds
for a 45K job. Thereafter, it is a fixed 1.5 seconds. Because the
average job size at WMU is about 10K, the average PQ2 quantum runtime
is approximately 1 second.

Table 6-4 lists the quantum runtime parameter values for HPQs, which 
are generated directly in the scheduling tables in COMMON. 



Table 6-4
HPQ Quantum Runtime Parameter Values

Name         Value    Units

QADTAB(2)    2        ticks
QMLTAB(2)    0        ticks
QMXTAB(2)    2        ticks



This implies a fixed quantum runtime of 2 ticks for all HPQ jobs, 
regardless of size. 

The rationale for each of the processor queue time slices is as 
follows. 



6.1.1 PQ1 Time Slice

In PQ1, the quantum runtime of 8 ticks allows very fast response for a
very short period of time. In-core protect time is a constant 3
seconds.

At WMU, most PQ1 jobs finish processing and return to long-term wait
within the 8 ticks allowed by the PQ1 quantum runtime. Table 6-5
lists the percentage of PQ1 jobs blocking to long-term wait as a
function of time.



Table 6-5
Percent of PQ1 Jobs Blocking to Long-Term Wait as Function of Time

Percent
Blocking     CPU Ticks Used

50%          Less than 6 ticks (1/10 second)
             Less than 20 ticks (1/3 second)
95%          Less than 50 ticks (5/6 second)



PQ1 jobs that do run long enough to expire their time slices are
requeued to PQ2, assigned a PQ2 amount of quantum runtime, and retain
their remaining in-core protect time. This reduces their response
priority to the level of PQ2, but allows the job to compute at least
as long as a PQ2 time slice.

As Table 6-5 shows, less than 5% of the PQ1 jobs compute long enough
to use the additional PQ2 time slice. For those that do, a small
reduction in swapping rate is achieved with little impact on the other
jobs in PQ2.



6.1.2 PQ2 Time Slice 

For PQ2 jobs, the quantum runtime is 0.75 second to 1.50 seconds, 
depending on job size. In-core protect is a fixed 3 seconds. These 
values give good response, low overhead, and optimum balance between 
CPU and I/O-bound jobs. 

Good response is achieved when the PQ2 time slice is small enough so
that jobs swapping in can find sufficient space in memory to come in
(free space or jobs with expired time slices). One measure of good
response is that the swapper can achieve full speed during periods of
heavy demand for short-term response. Another measure is the average
swap time required to swap in a PQ1 job, that is, the time from when
the job enters PQ1 to the time it is swapped in.

The scheduler overhead increases as time-slice parameters are made
smaller. Also, the PQ2 swapping rate goes up, making less swapper
capacity available to PQ1 jobs.

The goal is to make the PQ2 time slice small enough to allow good
response, and large enough to achieve low overhead and low PQ2
swapping rate.

A second goal is to make the ratio of in-core protect to quantum
runtime such that an optimum balance is achieved between CPU and I/O
jobs. This can be measured by looking at percent CPU utilization
versus utilization of the disk system. Disk rates are measured in
terms of the number of disk blocks transferred.



CPU utilization, disk rates, swapper rate, swap times, overhead, PQ1
swap rate, and PQ2 swap rate can be monitored with the system
performance analysis package. This has been done to ensure that
optimum values are in use. A reevaluation is done periodically,
because the system load characteristics change over time.

To illustrate the importance of a proper ratio of in-core protect time
versus quantum runtime, a test was run with simulated jobs to show the
effect of incorrect parameters. The job mix contains an equal number
of CPU and I/O-bound jobs. The same job mix was run with two
different monitors, one with WMU standard parameters (approximately 3
to 1), and one with incorrect parameters (approximately 1.5 to 1).

The test results are tabled below:

Table 6-6
Example of Effect of Incorrect Parameters

Monitor                  CPU     Disk Blocks/Minute

Standard Parameters      94%     3184

Incorrect Parameters     92%     2219

The standard parameters produced a better I/O rate with no decrease in
CPU utilization.
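
The size of the difference in Table 6-6 can be worked out directly.
The short Python sketch below uses only the four figures from the
table; the function and variable names are hypothetical.

```python
def percent_gain(new, old):
    """Relative improvement of new over old, in percent."""
    return 100.0 * (new - old) / old


standard = {"cpu": 94, "disk_blocks_per_minute": 3184}
incorrect = {"cpu": 92, "disk_blocks_per_minute": 2219}

disk_gain = percent_gain(standard["disk_blocks_per_minute"],
                         incorrect["disk_blocks_per_minute"])
cpu_difference = standard["cpu"] - incorrect["cpu"]

print(f"disk rate gain: {disk_gain:.1f}%")
print(f"CPU utilization difference: {cpu_difference} points")
```

The disk transfer rate changes far more between the two runs than CPU
utilization does.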

Note that the PQ2 time slice is sufficient to slow the PQ2 swapping
rate, but is not sufficient to bring the PQ2 swapping rate up to a
minimum level. To accomplish this, the swapping and scheduling
fairness counts are necessary. This is discussed in Section 6.2.



6.1.3 HPQ Time Slice 

HPQ jobs are assigned quantum runtimes of 2 ticks and in-core protect 
times of 3 seconds. The extremely small quantum runtimes allow very 
fast alternation between HPQ jobs. The in-core protect times are 
immaterial, because WMU never has more HPQ jobs than can fit in core 
at once. 



6.2 SWAPPING AND SCHEDULING FAIRNESS COUNTS 

Table 6-7 lists the default values for swapping and scheduling
fairness counts that are used at WMU.

Table 6-7
Default Values for Swapping and Scheduling Fairness Counts

Parameter   Description                    Value

IFC0        Swapping fairness              5

SFC0        Scheduling fairness (CPU0)     20

SFC1        Scheduling fairness (CPU1)     20

The swapping and scheduling fairness counts prevent PQ1 jobs from
taking over the system. PQ1 jobs are swapped in and scheduled ahead
of PQ2 jobs. If they exist in sufficient numbers, they can fill
memory and take over the system. The fairness counts allow PQ1 jobs
to have the highest priority up to a limit. After that, PQ2 jobs have
priority.

For good response, PQ1 should get the majority of swapper capacity.
Assuming the swapper is operating at 100% capacity, a good goal is 80%
for PQ1 jobs and 20% for PQ2 jobs. This allows good response for PQ1
jobs and provides good system throughput for PQ2 jobs.
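
One way to picture a fairness count is as a limit on consecutive PQ1
selections. The Python sketch below is illustrative only: the function
names, the state dictionary, and the exact reset rule are assumptions
made for this example, not the monitor's code.

```python
def pick_queue(pq1_waiting, pq2_waiting, state, fairness_count=5):
    """Choose which queue supplies the next job (hypothetical model).

    PQ1 is preferred until it has been chosen `fairness_count` times
    in a row; the next time a PQ2 job is waiting, PQ2 is chosen and
    the counter starts over.
    """
    pq2_is_due = pq2_waiting and state["pq1_run"] >= fairness_count
    if pq1_waiting and not pq2_is_due:
        state["pq1_run"] += 1
        return "PQ1"
    if pq2_waiting:
        state["pq1_run"] = 0
        return "PQ2"
    return None  # nothing is waiting in either queue


def simulate(selections, fairness_count=5):
    """Run the model with both queues always non-empty."""
    state = {"pq1_run": 0}
    return [pick_queue(True, True, state, fairness_count)
            for _ in range(selections)]
```

Under this model, with both queues saturated and a fairness count of
5, PQ2 receives one selection in every six (about 17%), which is in
the neighborhood of the 20% goal stated above.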

If PQ1 jobs are not restricted by fairness counts, system throughput 
will be severely degraded during periods of heavy demand for 
short-term response. This is because a PQ1 job typically blocks to
long-term wait very quickly after swap-in. Without restraint, memory
becomes filled with jobs that are not runnable. The swapper cannot
swap them out as fast as they expire. In this case, CPU utilization
goes down and lost time goes up. 

There are two direct measures of fairness. First, PQ2 jobs should get 
at least a certain minimum of swapping capacity whenever there are 
sufficient numbers of PQ2 jobs in the system. Second, the machine 
should not be filled with unrunnable jobs (that is, jobs in long-term 
wait, which are generally expired PQ1 jobs). Both of these variables
can be measured with the system performance analysis package. 

At WMU, the PQ2 swapping rate is approximately 20% when sufficient 
jobs exist and the swapper is operating at capacity. The average 
amount of core occupied by unrunnable jobs is approximately 20 pages 
(10K) out of a total user area of 91K.

To illustrate the effect of fairness counts on the WMU system, a set
of simulated jobs was created containing a mix of PQ1 and PQ2 jobs
similar to that seen on the real system. Performance was measured for
a wide range of swapping fairness counts. (See Table 6-8.)

                          Table 6-8
             Example of Effect of Fairness Counts

Swapping     CPU            Number of PQ1 Jobs   Number of PQ2 Jobs
Fairness     Utilization    Swapped in           Swapped in
                            per Minute           per Minute

   30           29.9              101.2                  3.4

   16           37.7               98.9                  6.2

    9           48.7               96.2                 10.7

    5           61.7               96.1                 19.1

    3           72.2               88.5                 29.6

    1           91.1               64.9                 64.3

The data shows that as more PQ2 jobs are swapped in (fairness
threshold is lowered), the CPU utilization is increased. At the same
time, the PQ1 swapping rate is decreased, showing a corresponding
impact on short-term response.

The value of 5 for swapping fairness was chosen at WMU because it
produces good PQ2 throughput with very little impact on short-term
response (PQ1 swapping rate). Scheduling fairness was arbitrarily set
to 20. In most cases this has little effect, because PQ1 jobs
typically expire so fast that PQ2 jobs run without the need for
scheduling fairness.
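
The share of swap-ins that each fairness setting gives to PQ2 can be
worked out from the figures of Table 6-8. The Python sketch below
assumes that the rows of the four columns pair up in the order they
are listed; the variable and function names are illustrative.

```python
# (swapping fairness, CPU utilization, PQ1 swap-ins/min, PQ2 swap-ins/min)
TABLE_6_8 = [
    (30, 29.9, 101.2, 3.4),
    (16, 37.7, 98.9, 6.2),
    (9, 48.7, 96.2, 10.7),
    (5, 61.7, 96.1, 19.1),
    (3, 72.2, 88.5, 29.6),
    (1, 91.1, 64.9, 64.3),
]


def pq2_share(pq1_rate, pq2_rate):
    """Fraction of all swap-ins that went to PQ2 jobs."""
    return pq2_rate / (pq1_rate + pq2_rate)


for fairness, cpu, pq1, pq2 in TABLE_6_8:
    share = pq2_share(pq1, pq2)
    print(f"fairness {fairness:2d}: PQ2 share {share:5.1%}, CPU utilization {cpu}")
```

For the WMU setting of 5, the PQ2 share is 19.1 / (96.1 + 19.1), or
roughly one swap-in in six.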



6.3 IN-CORE FAIRNESS FACTOR

WMU uses the default value of 50% for the in-core fairness factor. 
This gives good response to jobs that do GETSEGs without allowing them 
to take over the swapping. 

Values below 50% are not recommended because too many low segments 
would exist in memory in an unrunnable state. 



6.4 CLASS QUOTAS AND MICROSCHEDULING INTERVAL

The WMU class scheduler comes up in Round Robin mode. The SCDSET
program is run shortly after start up with an OPSER.ATO file to place 
the scheduler in Class Scheduler mode. The file defines primary 
percentages of 95% for class 0, and 5% for classes 1 and 2. Secondary 
allocations and fixed swapping bits are set in a variety of 
permutations to test the response and accuracy of the class scheduler. 

The microscheduling interval is set to 30 ticks or 0.5 second. 

WMU tested values for the microscheduling interval in the range 1 to 
60 ticks. The smaller values gave the best accuracy and smoothest
response. In this range, there was no measurable difference in
scheduler overhead.



6.5 BACKGROUND BATCH PARAMETERS 

The WMU system starts up with the default value -1 for background
batch class and for background batch swap time. The SCDSET program,
which runs at start up, defines class 15 as the background batch class
with a background batch swap time of 120 ticks, or 2 seconds. Primary
percentage and secondary allocation are both 0.

Values of 60 through 130 ticks were tried in live operation under 
various system loads. The value of 120 ticks appears to adequately 
prevent thrashing. 



6.6 RESPONSE FAIRNESS FACTOR 

The WMU system is assembled with the default value of 10% as the 
response fairness factor. This is overridden at start up by the
SCDSET program to a value of 100%, which produces the best possible 
response at all times. 

WMU has tried a range of values from 1% to 100% on the live system 
under a wide variety of loads. A value of 10% gives good response 
with very good accuracy. Values below 10% produce poor response, and 
are not recommended. Values above 10% did not noticeably improve 
response, but did reduce accuracy. 

The present value of 100% is arbitrary. WMU is presently more 
concerned with response than accuracy. 



6.7 AVERAGE SWAP TIME

The WMU system is assembled with the average job size, PAVJSP, set to 
the default value of 20 pages, or 10K. With the WMU primary swapping 
device, an RD10, this yields an average swap time of 9 ticks. 

This value of the parameter gives very accurate allocation of time to 
fixed classes during periods when no other classes are present.



6.8 JOB CLASS 

The WMU system begins operation with all job classes set to default 
values of 0. Jobs are placed in the appropriate class as they log in. 
WMU uses classes 0, 1, 2, and 15. 



6.9 CLASS RUNTIME 

WMU class runtimes are set to the default values of 0 at start up.



CHAPTER 7
DETAILED DESCRIPTION OF SCHED. MONITOR CALL 



This section describes the macro code for the SCHED. monitor calls. 
This code is included in the class scheduler only. 



7.1 SCHED. TO SCDQTA SECTION 

The scheduling parameters are defined by the system administrator 
through the SCDSET program, which uses the SCHED. monitor calls to 
store the parameters in the monitor data base. The SCHED. monitor 
calls store most parameters and retrieve all parameters. A 
description of the detailed code for each of the SCHED. monitor calls
follows.

At SCHED., the argument block for the SCHED. monitor call is
interpreted and checked for legality. A dispatch is made to the
appropriate read or write routine based on the function code.
Functions 1, 4, and 8 are not used in the WMU class scheduler.



7.1.1 Function 0

Routine SCHRSI reads the microscheduling interval (SCDINT).

Routine SCHWSI writes SCDINT. It also forces a new scheduling 
interval to begin. If the microscheduling interval goes to zero, the 
scheduler is placed in Round Robin mode by clearing RRFLAG. 



7.1.2 Function 2 

Routine SCHRQT reads the primary percentages for each class up to the 
class specified in the argument block. First, check the class number 
for legality. Then, for each class up to that number, load the 
primary percentage and status bits from table CLSSTS, and store them 
in the user-specified area. 

Routine SCHWQT stores the primary percentages for any number of 
classes. The first argument specifies the number of classes to be 
stored. Each following argument contains the class number and status 
bits in the left half, and the primary percentage in the right half. 
First, check the class number for legality. Then, store the primary 
percentage and status bits in table CLSSTS. After all of the 
arguments have been processed, zero the table of runtimes by class
(CLSRTM).

At SCHWQ2, build a table of all classes with positive primary
percentages in SQSCAN. Each entry in the table is of the form XWD 0,
class number. Store the total number of classes with primary
percentages in CNTSTS. If no classes have a primary percentage or the
percentages do not add to 100%, place the scheduler in Round Robin
mode by clearing RRFLAG, and leave SCHWQT.

At SCHWQ6, pick the next class to be entered into the primary scan
table PSQTAB. If there is only one class, go to SCHWQ9 to store that
class. Otherwise, determine which class is most overdue to be picked
at SCHWQ7. For each class in SQSCAN, add its primary percentage to
the relative priority, which is stored in the left half of the SQSCAN
table. Weight the relative priority by multiplying by the class
primary percentage, and if the product is the largest seen so far, set
AC P1 to point to this class. Repeat from SCHWQ7 until all classes
have been tested.

At SCHWQ9, store the selected class as the next entry in PSQTAB.
Also, subtract 100% from its relative priority to reflect the fact
that it is no longer overdue to be selected. Repeat from SCHWQ6 until
all 100 entries have been stored in PSQTAB. Set entry 101 to entry 1
for use by CPU1.

This algorithm guarantees that each class will be selected the number
of times specified by its primary percentage. Also, this algorithm
spaces the entries optimally if each percentage is a multiple of ten,
and does a very good job on most other cases.
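
The table-building loop described for SCHWQ6 through SCHWQ9 can be
sketched in Python as follows. This is a minimal model under stated
assumptions, not the monitor's MACRO code: the function name
`build_psqtab`, the dictionary input, and the use of `None` to signal
the Round Robin fallback are all invented for illustration, and
percentages are assumed to be whole numbers.

```python
TABLE_SIZE = 100  # one PSQTAB entry per percentage point


def build_psqtab(percentages):
    """Build a primary scan table from {class number: primary percentage}.

    Returns a list of 101 class numbers, or None when the input would
    leave the scheduler in Round Robin mode (no positive percentages,
    or percentages that do not add to 100).
    """
    active = {cls: pct for cls, pct in percentages.items() if pct > 0}
    if not active or sum(active.values()) != TABLE_SIZE:
        return None
    priority = dict.fromkeys(active, 0)  # relative priority per class
    table = []
    for _ in range(TABLE_SIZE):
        best_class, best_product = None, None
        for cls, pct in active.items():
            priority[cls] += pct              # class becomes more overdue
            product = priority[cls] * pct     # weight by its percentage
            if best_product is None or product > best_product:
                best_class, best_product = cls, product
        table.append(best_class)
        priority[best_class] -= TABLE_SIZE    # no longer overdue
    table.append(table[0])                    # entry 101 repeats entry 1
    return table
```

For example, `build_psqtab({0: 90, 1: 5, 2: 5})` returns a table whose
first 100 entries contain class 0 ninety times and classes 1 and 2
five times each, while `build_psqtab({0: 95, 1: 5, 2: 5})` returns
`None` because the percentages add to 105.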



7.1.3 Function 3 

Routine SCHRTS reads the base quantum runtimes for either PQ1 or PQ2.

Routine SCHWTS stores the base quantum runtimes for PQ1 or PQ2 in the
QADTAB table. The first word in the argument block specifies the
number of arguments to follow. A code of 1 in the left half of the
argument specifies PQ1, and a code of 2 in the left half of the
argument specifies PQ2. The right half of the argument contains the
new value for the base quantum runtime.



7.1.4 Function 5 

Routine SCHRJC reads the class numbers for all jobs in the system up 
to the job specified in the argument block. First, check the job 
number for legality. Then, for all jobs up to that job number, load 
the job's class from table JBTSCD and store it in the user-specified 
area. 

Routine SCHWJC places any number of jobs into their proper scheduler 
classes. The first argument specifies the number of jobs to be 
reclassified. Each following argument contains the job number in the 
left half and the new class number in the right half. First, make 
sure that the job number is valid and that the job is logged in. 
Then, check the class number for legality and store the new class 
number in table JBTSCD. If the job is in PQ2, set the changing 
subqueue bit (JS.CSQ) and requeue the job. 
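
Several of these functions take arguments that carry one value in the
left half of a 36-bit word and another in the right half. The Python
sketch below shows that packing for the job-reclassification
arguments. The helper names are hypothetical, and the block layout is
simplified to the count word followed by the argument words described
above.

```python
HALF_BITS = 18
HALF_MASK = (1 << HALF_BITS) - 1  # 777777 octal


def xwd(left, right):
    """Pack two 18-bit halves into one 36-bit word, as MACRO's XWD does."""
    return ((left & HALF_MASK) << HALF_BITS) | (right & HALF_MASK)


def halves(word):
    """Split a 36-bit word into its (left half, right half)."""
    return (word >> HALF_BITS) & HALF_MASK, word & HALF_MASK


def build_reclassify_block(assignments):
    """Build argument words from a list of (job number, new class) pairs.

    The first word is the number of jobs; each following word holds
    the job number in the left half and the class in the right half.
    """
    return [len(assignments)] + [xwd(job, cls) for job, cls in assignments]
```

For example, `build_reclassify_block([(5, 1), (12, 15)])` produces a
count word of 2 followed by two packed words, and `halves` recovers
`(5, 1)` from the first of them.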



7.1.5 Function 6

Routine SCHRMC reads the constant added to in-core protect time
(PROT0).

Routine SCHWMC writes PROT0.



7.1.6 Function 7 

Routine SCHRCT reads the runtime used by each class up to the class
specified in the argument block. First, check the class number for 
legality. Then, for each class up to that number, load the runtime 
used by that class from table CLSRTM and store it in the 
user-specified area. Runtimes are stored in ticks and represent the 
CPU time used in PQ2 since the primary percentages were last changed. 
The write option is illegal for function 7. 



7.1.7 Function 9 

Routine SCHRPF reads the multiplier used to calculate in-core protect
time (PROT).

Routine SCHWPF writes PROT. 



7.1.8 Function 10

Routine SCHRCD reads the default class for a new job (DEFCLS).

Routine SCHWCD sets DEFCLS. 

7.1.9 Function 11 

Routine SCHRRC reads the constant used for assigning in-core protect
time on requeue because of time-slice expiration (PROT1).

Routine SCHWRC writes PROT1.

7.1.10 Function 12 

Routine SCHRPM reads the maximum value of in-core protect time
(PROTM).

Routine SCHWPM writes PROTM. 

7.1.11 Function 13 

Routine SCHRPM reads the in-core protect time constant (PROT0).
Routine SCHWRC writes PROT0.

7.1.12 Function 14

Routine SCHRML reads the quantum multipliers for either PQ1 or PQ2, or
the scale factor used in calculating quantum runtime (QRANGE).

Routine SCHWML stores the quantum multipliers for PQ1 or PQ2 into the
QMLTAB table. As in function 3, a code of 1 specifies PQ1 and a code
of 2 specifies PQ2. A code of 3 specifies a new value for QRANGE.



7.1.13 Function 15

Routine SCHRMX reads the maximum quantum runtimes for either PQ1 or
PQ2.

Routine SCHWMX stores the maximum quantum runtimes for PQ1 or PQ2 into
the QMXTAB table. A code of 1 specifies PQ1 and a code of 2 specifies
PQ2.



7.1.14 Function 16 

Routine SCHRSQ reads the secondary allocations for each class up to 
the class specified in the argument block. First, check the class 
number for legality. Then, for each class up to that number, load the
secondary allocation from table CLSQTA and store it in the 
user-specified area. 

Routine SCHWSQ stores the secondary allocations for any number of 
classes. The first argument specifies the number of classes to be 
stored. Each following argument contains the class number in the left 
half and contains the secondary allocation in the right half. First, 
check the class number for legality. Then, store the secondary 
allocation of the class in the table CLSQTA. After all arguments have 
been processed, store the number of the highest class with a positive
secondary allocation in MAXQTA. 



7.1.15 Function 17 

Routine SCHRIQ reads the response fairness factor (SCDJIL).

Routine SCHWIQ writes SCDJIL. The value is a percentage and must be 
positive. 

7.1.16 Function 18 

Routine SCHRSS reads the average swap-time estimate (SCDSWP).
Routine SCHWSS writes SCDSWP. The value is specified in ticks. 

7.1.17 Function 19 

Routine SCHRBB reads the background batch class (BBSUBQ).

Routine SCHWBB writes BBSUBQ. The value must be a legal class number 
or -1 if no background batch is desired. 

7.1.18 Function 20

Routine SCHRBS reads the background batch swap-time interval (SCDBBS).
Routine SCHWBS writes SCDBBS. The value is specified in ticks.

7.1.19 Function 21 

Routine SCHRSF reads the scheduler fairness factor for CPU0.

Routine SCHWSF writes the scheduler fairness factor for CPU0. The
same value is stored for CPU1 if it exists. The value must be
positive.

7.1.20 Function 22 

Routine SCHRSW reads the swapper fairness factor (MAXIFC).
Routine SCHWSW writes MAXIFC. The value must be positive. 

7.1.21 Function 23 

Routine SCHRIO reads the in-core fairness factor (SCDIOF).

Routine SCHWIO writes SCDIOF. The value is a percentage and must be 
positive. 

7.1.22 Function 24 

Routine SCHRCS reads the core scheduling interval (SCDCOR). The value
is converted to seconds before being returned to the user. SCDCOR is
used to determine whether in-core protect times are used in
scheduling.

Routine SCHWCS converts the user argument from seconds to tick-pairs, 
and stores the result in SCDCOR. 

7.2 SCDQTA TO SCDQT7 SECTION 

This section checks for the end of the microscheduling interval and 
performs all necessary functions when the interval expires. Routine 
SCDQTA is called once every tick. 

If no microscheduling interval is defined (SCDINT=0), or no primary
classes are defined (CNTSTS=0), return immediately because the
scheduler is operating in Round Robin mode. Otherwise, set RRFLAG 
nonzero to cause the scheduler to operate in Class Scheduling mode. 

If the current microscheduling interval is not yet over
(UPTIME<SCDTIM), return. Otherwise, store the end of the new
microscheduling interval in SCDTIM. Store the time at which response
fairness is no longer in effect in SCNJIL.

Advance the primary scan pointers to the next class for both CPUs.
For each CPU, load the primary class into AC T1 and the address of the
subqueue scheduling scan table into T4, and call SCDSST to build the
scan table.

At SCDSST, set the first entry in the scan table to the primary class.
Build the secondary scan table in the remaining locations of the scan
table. All classes with secondary allocations except the primary
class are entered into the table in the form: XWD class, secondary
allocation. The sum of the secondary allocations of all classes
entered into the secondary scan table is accumulated in SSSUM.

At SCDQT4, select a random integer in the range 0 to SSSUM-1. This
integer determines which class will be selected next for insertion
into the subqueue scheduling scan table for the microscheduling
interval. The secondary allocations of each entry in the secondary
scan table are successively subtracted from the random integer until
it goes negative. The class that causes it to go negative is selected
as the next class to insert into the subqueue scan table. Thus, the
probability of any given class being selected is equal to its
secondary allocation divided by the total of the secondary allocations
of all remaining classes (SSSUM).

Eliminate the selected class from further consideration by moving the
bottom entry up on top of it, and by subtracting its secondary
allocation from SSSUM. Store the selected class as the next entry in
the scan table. Repeat from SCDQT4 until all entries in the secondary
scan table have been incorporated into the subqueue scheduling scan
table. A zero terminates the table.
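
The weighted random ordering performed at SCDSST and SCDQT4 can be
sketched in Python as follows. The function name, the dictionary
input, and the injectable random source are assumptions made for this
example. The text does not show how table entries are encoded, so the
sketch ends the table with `None` rather than zero to avoid confusion
with class 0.

```python
import random


def build_scan_table(primary, secondary_alloc, rng=random):
    """Order classes for one microscheduling interval (illustrative).

    primary: the primary class for this interval.
    secondary_alloc: {class number: secondary allocation}.
    rng: any object with randrange(), e.g. random.Random(seed).
    """
    # Secondary scan table: every class with an allocation, except primary.
    pending = [(cls, alloc) for cls, alloc in secondary_alloc.items()
               if alloc > 0 and cls != primary]
    sssum = sum(alloc for _, alloc in pending)
    table = [primary]
    while pending:
        draw = rng.randrange(sssum)          # integer in 0 .. SSSUM-1
        for index, (cls, alloc) in enumerate(pending):
            draw -= alloc
            if draw < 0:                     # this class is selected
                break
        table.append(cls)
        sssum -= alloc                       # drop it from the total
        pending[index] = pending[-1]         # move bottom entry on top of it
        pending.pop()
    table.append(None)                       # terminator
    return table
```

Because each draw is taken only over the classes that remain, every
class with a secondary allocation appears exactly once, and classes
with larger allocations tend to appear earlier in the table.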



1.0 GENERAL DISK I/O FLOW

This discussion assumes that the disk request has been processed to 
the point that an I/O list has been built and the initial sector 
address is known. The disk is not necessarily on the correct 
cylinder. The following flow describes the general processing; 
subsequent text will describe it more fully.

 1- Calculate required cylinder.
 2- If seek required then
 3-    If data transfer in progress
          (non-MASSBUS device or MASSBUS
          device with active transfer on this unit)
 4-       Queue request to this unit
 5-       Exit
 6-    Else
 7-       Start seek
 8-       Exit
 9- Else
10-    If data transfer in progress then
11-       Queue request for channel
12-       Exit
13-    Else
14-       Disable attention interrupts
15-       Start transfer
16-       Exit
17- End
18- On interrupt
19-    Read drive number and register # from RH
20-    Read attention summary register
21-    For each on-bit in summary register do
22-       If corresponding drive was not transferring
23-          If seek complete then queue request for channel
24-          Else process status (e.g., drive coming on line)
25-    If data transfer complete then
26-       If hardware detected error then
27-          Perform error recovery
28-       Compare channel termination with predicted termination
29-       If software detected error
30-          Perform error recovery
31-    For each unit with queued requests do
32-       Select next seek and start it
33-    If channel queue (already positioned drives) is not empty then
34-       Select best transfer and start it
35-    Restore register # and drive # to RH
36-    Dismiss interrupt

The correct cylinder is determined by dividing the sector number by
the number of sectors per cylinder. To determine if a seek is needed
(2), the cylinder number is compared with the current cylinder number,
which is remembered from the last transfer. (There are some limited
conditions under which the drive will not be on the cylinder which is
recorded in the software. In these cases, the implied seek of the
drive will be used.) The system can only start a seek if the drive is
idle (for non-MASSBUS drives, both the drive and the controller must
be idle). Therefore, if there is a data transfer in progress, the
request is queued for the unit and will be started at a later time at
interrupt level (4). If the drive is free, a seek will be started.
If the drive is already on the correct cylinder, the seek logic is
bypassed. If the drive and channel are not already busy, then the
transfer is started; otherwise, the request is added to a queue for
the required channel to be started at interrupt level at a later time.
A transfer may range in size from a single block (128 words) to a whole
cylinder; TOPS10 attempts to perform the longest possible transfer in
order to maximize I/O throughput. The system never attempts an
implied seek in the middle of a transfer. Such a user request would
be broken into two or more transfers with explicit intermediate seeks.
Also, in order to simplify the code considerably, attention interrupts
are disabled while doing a data transfer.
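
Steps 1 through 16 of the flow reduce to one division and two tests.
The Python sketch below illustrates that decision. The function name,
the action strings, and the two boolean inputs are hypothetical; in
particular, `seek_blocked` stands for the condition in step 3 (a
transfer active on this unit for a MASSBUS drive, or any transfer in
progress for other drives), which the caller is assumed to have
evaluated.

```python
QUEUE_FOR_UNIT = "queue request to this unit"
START_SEEK = "start seek"
QUEUE_FOR_CHANNEL = "queue request for channel"
START_TRANSFER = "disable attention interrupts, start transfer"


def plan_request(sector, sectors_per_cylinder, current_cylinder,
                 seek_blocked, channel_busy):
    """Return (required cylinder, action) for a new disk request."""
    cylinder = sector // sectors_per_cylinder       # step 1
    if cylinder != current_cylinder:                # step 2: seek required
        if seek_blocked:                            # step 3
            return cylinder, QUEUE_FOR_UNIT         # step 4
        return cylinder, START_SEEK                 # step 7
    if channel_busy:                                # step 10
        return cylinder, QUEUE_FOR_CHANNEL          # step 11
    return cylinder, START_TRANSFER                 # steps 14-15
```

For instance, with an assumed geometry of 380 sectors per cylinder,
sector 1000 lies on cylinder 2, so a drive already positioned on
cylinder 2 with an idle channel starts the transfer immediately.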

When an interrupt occurs, the system may or may not have just
completed a data transfer. Because attention interrupts are disabled
during the data transfer, there may be a number of outstanding
attention conditions when the interrupt is actually honored. First,
the system reads the attention summary register. Each drive (except
the one which was completing a transfer) is checked for an attention
bit on. If there is an attention bit on, and if there is a seek
complete, the transfer request is added to the channel queue to be
started for I/O. If there was no seek in progress, then the drive has
just come on line or powered up (see later discussion for these
conditions).

Once all outstanding seeks are processed, the data transfer completion
is handled. If there was no error or after error correction (see
error recovery later), the channel termination word is compared to the
predicted channel termination word. If the check fails, then error
recovery is started. After completing the processing for the
interrupt, any outstanding seeks are started. For each drive, the
closest (shortest) seek is the one selected for startup (a fairness
count will cause the system to select the oldest transfer every 'n'th
time). After seeks are started, the channel queue is checked for
positioned drives and the transfer with the shortest latency is
started (again unless the fairness count says otherwise). SWAPPER
requests receive preference over file I/O (unless fairness count
expires).
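
The seek selection rule (shortest seek first, with the oldest request
taken every 'n'th time) can be sketched in Python as follows. The
function name, the list-based queue, and the counter kept in a
dictionary are assumptions for illustration; the monitor's actual
data structures are not shown in the text.

```python
def select_seek(queue, current_cylinder, state, fairness_count):
    """Remove and return the next target cylinder from `queue`.

    queue: target cylinders for one drive, oldest request first.
    state: dict holding a "picks" counter between calls.
    Every `fairness_count`-th call takes the oldest request;
    all other calls take the request closest to the heads.
    """
    if not queue:
        return None
    state["picks"] += 1
    if state["picks"] >= fairness_count:
        state["picks"] = 0
        return queue.pop(0)                      # oldest request
    closest = min(range(len(queue)),
                  key=lambda i: abs(queue[i] - current_cylinder))
    return queue.pop(closest)                    # shortest seek
```

With requests for cylinders 400, 10, 105, and 90 queued in that order,
heads at cylinder 100, and a fairness count of 3, the model services
105 and then 90 as the shortest seeks, and then 400 as the oldest
request even though it is the farthest away.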

There is some special processing for interrupts on a MASSBUS device 
caused by the fact that the system may be attempting some operation 
using the device registers at UUO level at the time of the interrupt. 
When the interrupt occurs, the system reads RHxx and saves the drive 
and register number to which the RH was pointing. Before dismissing 
the interrupt, a DATAO is done to restore the drive number and 
register number. The need for reading the register from the RH at 
interrupt and restoring them before dismissing the interrupt is made 
worse by the fact that the system must wait 3 microseconds after the 
DATAO specifying what data is wanted before the DATAI can read the 
data. 

There are other special considerations with the front end disk unit.
In general, both the front end and TOPS10 may attempt to use the disk
at the same time. The most frequent conflict occurs at system startup
when the front end is using the disk at the same time that TOPS10 is
running INITIA on all lines. (There is a count of the times that TOPS10
tried to get the disk and found it busy; this normally rises quickly
at system startup to about 40 and seldom changes thereafter.) It is
possible to do an assembly on the front end while timesharing
continues on the -10, which might generate considerable conflict.



When the -10 attempts to get the disk and finds that it is in use by
the front end, the request is delayed (with considerable trickery to
upper level code) and restarted when the drive can be gotten. Since
TOPS10 may complete a seek for the front end drive and have the front
end grab the disk and move it before the data transfer is started, it
is possible that the drive will not be on the correct cylinder when
TOPS10 tries to start the transfer. In this case, implied seek will
be used since TOPS10 will not realize that the disk has been moved.
This would also happen if TOPS10 got two different requests for the
same cylinder and decided that no seek was necessary when, in fact,
the front end had moved the heads.



2.0 DUAL PORT HANDLING 

The dual port handling is very simple. It occurs only when the system 
attempts a data transfer on one channel and finds it busy. It then 
tries the other port. At no other time is the dual port facility 
used. 

At system startup, the system reads the drive type and serial number
from each drive on all channels. When the same serial number and drive
type are found on two different channels, the disk is determined to be
dual ported. One path (the first one found) is then the primary path
and the second is the alternate path. When starting a transfer, the 
system will attempt to use the primary path. If that path is busy, it 
will then check for the alternate path; if that is available, the 
transfer is started. Otherwise, the request is placed in the channel 
queue for the primary channel. 
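
Both halves of the dual port logic, detection at startup and path
choice at transfer time, are small enough to sketch in Python. The
function names, the tuple layout, and the returned strings are
hypothetical and chosen only for illustration.

```python
def find_paths(drives):
    """Map each physical drive to its (primary, alternate) channel.

    drives: (channel, drive type, serial number) tuples in the order
    the drives are discovered. The first channel seen for a drive is
    its primary path; a second, different channel is its alternate.
    """
    paths = {}
    for channel, drive_type, serial in drives:
        key = (drive_type, serial)
        if key not in paths:
            paths[key] = [channel, None]
        elif paths[key][1] is None and channel != paths[key][0]:
            paths[key][1] = channel
    return {key: tuple(value) for key, value in paths.items()}


def choose_path(primary_busy, alternate_busy, dual_ported=True):
    """Decide where a data transfer goes when it is about to start."""
    if not primary_busy:
        return "start on primary"
    if dual_ported and not alternate_busy:
        return "start on alternate"
    return "queue on primary channel"
```

A drive reported with the same type and serial number on channels 0
and 1 is recorded as `(0, 1)`; a drive seen on one channel only is
recorded with `None` as its alternate and always uses its primary
channel queue when that channel is busy.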



3.0 EXCEPTION CONDITIONS 

There are a number of possible error conditions that can occur while 
TOPS10 attempts to operate the disks. This section will attempt to 
list the error conditions, the circumstances under which they occur, 
and the action taken by the system. It will not attempt to show the 
'flow diagram' of the error handling in the normal code. In general, 
the error processing is called as soon as possible after the error is 
detected. 



3.1 Data Transfer Errors 

These errors are detected on the completion interrupt for a data 
transfer (either read or write). These do not include the software
detected error of the channel termination word not agreeing with the 
predicted channel termination word. 



3.1.1 ECC Correctable Error - When a transfer terminates with an ECC
correctable error, the transfer stops after the sector in error. The
software will reconstruct the data and restart the transfer at the
sector following the sector in error.



3.1.2 Non-data Error - When a transfer completes with an error that
is not a data error (for example, a header error), the software will
attempt to retry the transfer a number of times before recording the
error as a hard error. The retry sequence is:

     Retry 10 times
     Recalibrate
     Seek
     Retry 10 times
     Recalibrate
     Seek
     Retry 10 times

If after 30 tries the transfer still fails, the error is considered 
hard and an error is returned to the user. The data is recorded in 
SYSERR and a count of hard errors for this device is incremented. 

If the count of hard errors reaches a system default (now 25), a
message is given to the operator saying that there has been an 
excessive number of hard errors and the count is zeroed. The 
expectation is that the operator may want to set some hardware offline 
or call his field service rep to run a few diagnostics. 



3.1.3 Data Error - If a data error (including header compare error) 
occurs which is not ECC correctable, then the system will retry the 
transfer and will use the offset register to vary the head position on 
each side of the track centerline. The retry sequence is: 

     Retry 16 times on centerline
     Offset head to +200 microinches
     Retry 2 times
     Offset head to -200 microinches
     Retry 2 times
     Offset head to +400 microinches
     Retry 2 times
     Offset head to -400 microinches
     Retry 2 times
     Offset head to +600 microinches
     Retry 2 times
     Offset head to -600 microinches
     Retry 2 times
     Return to centerline
     Retry disabling stop on error
     Retry (every retry except the next to last is done
        with stop on error enabled; this enables the recording
        of the maximum of information in SYSERR).

For an RP04 the offset distances are twice the above. If the transfer
is recovered at the offset position, the drive is left positioned at
offset. If the next transfer on that drive is for that cylinder, it
is first attempted at the same offset. If that fails, the head
is returned to centerline and the entire above sequence is tried. If
any seek is done, the heads will be on centerline (including transfers
which cause the head to return to the cylinder on which an error was
recovered at offset). If the device is not an RP04 or RP06 (MASSBUS
drive), the error recovery is 10 retries of the sequence: read/write
10 times, recalibrate, seek.
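
The data error retry sequence above can be written out as a table of
(head offset, stop-on-error setting) pairs, one per retry. The Python
sketch below generates that table. The function name and the
`offset_scale` parameter are invented for this example; a scale of 2
models the doubled offset distances stated for the RP04.

```python
def data_error_retry_plan(offset_scale=1):
    """List the retries as (offset in microinches, stop_on_error)."""
    plan = [(0, True)] * 16                   # 16 retries on centerline
    for distance in (200, 400, 600):
        for sign in (+1, -1):                 # each side of centerline
            offset = sign * distance * offset_scale
            plan += [(offset, True)] * 2      # 2 retries at each offset
    plan.append((0, False))   # next to last: stop on error disabled
    plan.append((0, True))    # last retry, stop on error enabled again
    return plan
```

The plan holds 30 retries in all: 16 on centerline, 12 at the six
offset positions, and 2 after the head returns to centerline.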

On a hard non-recoverable error, an error is returned to the user and
the system remembers the block number in error. When the file is
subsequently closed, the system checks for a remembered block number.
It starts reading from the bad block number+1 until it finds a good
sector (or 1000 sectors, whichever is smaller). This gives the extent
of the error region, which is then recorded in both the RIB of the
file and the BAT block. If the program does not close the file after
the error, but continues processing and hits a second error, the
second error is lost.
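
The scan that measures the bad region at close time can be sketched
in Python as follows. The function name and the `block_is_good`
callback are hypothetical, and the sketch assumes that the 1000-sector
limit caps the length of the recorded region; the text does not say
exactly how the limit is counted.

```python
def bad_region_extent(first_bad_block, block_is_good, limit=1000):
    """Return the length of the bad region starting at first_bad_block.

    block_is_good: callable that reads a block and reports success.
    Reading starts at first_bad_block + 1 and stops at the first good
    block, or when the region reaches `limit` blocks.
    """
    length = 1                       # the remembered bad block itself
    block = first_bad_block + 1
    while length < limit and not block_is_good(block):
        length += 1
        block += 1
    return length
```

With blocks 100 through 102 bad and block 103 readable, the function
returns 3 for a remembered block number of 100.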



3.2 Seek and Status Errors 

In the attention summary register, on an interrupt, there may be
attention interrupts for drives that were not transferring or seeking.
In this case, the drive is notifying the system of some sort of status
change, such as coming on line or going down.



3.2.1 Medium-on-line = 0 - If medium-on-line is 0, the drive has just
powered down. It is marked as such in the monitor tables. 



3.2.2 Drive Powered Up - If medium-on-line (MOL)=1 and volume valid
(VV)=0, then the drive has just powered up. The monitor will read the
home blocks and check that the pack is the expected pack on that 
drive. 



3.2.3 Seek Incomplete - On all seek errors, the error is counted and 
ignored. This will cause the data transfer to use the implied seek 
facility to perform the actual seek. If that implied seek fails, the 
data transfer will return an error and the appropriate retry sequence
will be started (see Section 3.1.2).



3.2.4 Hung Device - Any time a seek or data transfer is started, the 
monitor starts an independent 'hung timer' that will fire in 7 seconds 
if the device has not responded with a completion interrupt for the 
operation. 

If the failing request was a seek, then it is retried. If it was a 
data transfer, the monitor does a CONO to clear BUSY and set DONE. 
After this, the appropriate retry sequence for a data error is 
started. If 8 hung retries in a row fail, then the monitor will set 
the drive offline and tell the operator that it is offline (message is
Inconsistent Status for Drive x).



3.2.5 RIB Errors - Every RIB error detected (along with every 'n'
hard errors) is reported to the operator. 



3.3 RAE Errors 

On an RH10, Register Access Error (RAE) is ignored. The hardware will 
set the Selected Drive RAE at which point error recovery is started. 



On the RH20, after every DATAO, a CONSZ on RAE is done; if there was
an RAE, then it is cleared and the DATAO is retried. There is also a
system count of RAEs per controller for the RH20s.



4.0 BAT BLOCKS 

The BAT blocks provide a record of up to 63 errors on the disk. After
each detected error (actually when the file is closed), the monitor
will update the BAT blocks with the blocks in error and the type of
error. It is possible that the BAT blocks will be filled to
overflowing and there will be no room for additional entries. The
system will leave bad blocks marked as allocated in the SAT table and
thus avoid reallocating them. SYSERR will also complain when there
are fewer than 5 entries remaining in the BAT block.

In aeneral . there are some pathological cases where the total damage 
to" 'a disk is unknown, but a reasonable PM of disks which includes 
checking SYSERR and DSKRAT and saving, refreshing (or replacing), and 
restoring packs with many bad spots will avoid difficulties. 
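
The capacity limits described above can be modelled as follows. This 
is a minimal sketch only: the class BatBlock and its entry layout 
(first block, count, error type) are hypothetical and do not describe 
the on-disk BAT block format. Only the limits of 63 entries and the 
warning below 5 remaining entries come from the text. 

```python
BAT_CAPACITY = 63      # from the text: a record of up to 63 errors
LOW_WATER_MARK = 5     # from the text: SYSERR complains below 5 remaining entries


class BatBlock:
    """Toy model of a BAT block as a bounded list of bad-region entries."""

    def __init__(self):
        self.entries = []

    def remaining(self) -> int:
        return BAT_CAPACITY - len(self.entries)

    def record(self, first_block: int, count: int, error_type: str) -> bool:
        """Add an entry; return False if the BAT block has overflowed."""
        if self.remaining() == 0:
            return False   # no room: the bad blocks stay allocated in the SAT only
        self.entries.append((first_block, count, error_type))
        return True

    def nearly_full(self) -> bool:
        """True when fewer than 5 entries remain."""
        return self.remaining() < LOW_WATER_MARK
```

A pack whose BAT block reports nearly_full is a candidate for the 
save, refresh and restore procedure mentioned above. 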



5.0 DSKRAT 

DSKRAT is a program which can be run to check for RIB errors and to 
compare the disk space allocated as reported in the SAT table with the 
allocation as reported by the RIBs of the files on the pack. In 
general, it will find four kinds of errors: 

1. RIB errors - A RIB is not consistent in format with a valid 
RIB. 

2. Lost blocks - These are blocks which are marked as 
allocated in the SAT but are not a part of any file. 

3. Free blocks - These are blocks which are owned by some file 
on the system but are not marked as allocated in the SAT 
table. If one of these files is deleted, the system will get 
a BA2 STOPCD. 

4. Multiply Defined blocks - These are blocks which are marked 
as owned in two or more files. 
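
The three allocation checks in the list above amount to a set 
comparison between the SAT and the RIBs. The sketch below is not 
DSKRAT itself. It assumes the SAT has already been reduced to a set of 
allocated block numbers and each file's RIB to a list of owned block 
numbers, and the function name cross_check is hypothetical. 

```python
from collections import Counter


def cross_check(sat_allocated, file_blocks):
    """Compare SAT allocation with the blocks claimed by each file's RIB.

    sat_allocated: set of block numbers marked allocated in the SAT.
    file_blocks:   dict mapping a file name to the block numbers its RIB claims.
    """
    # Count how many distinct files claim each block.
    owners = Counter(
        block for blocks in file_blocks.values() for block in set(blocks)
    )
    owned = set(owners)
    return {
        "lost": sorted(sat_allocated - owned),    # allocated, but in no file
        "free": sorted(owned - sat_allocated),    # in a file, but not allocated
        "multiply_defined": sorted(b for b, n in owners.items() if n > 1),
    }
```

For example, with blocks 1 to 4 allocated in the SAT and two files 
claiming blocks [1, 2] and [2, 5], blocks 3 and 4 are lost, block 5 is 
free, and block 2 is multiply defined. 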

The safest procedure when a disk has significant problems in terms of 
free or multiply defined blocks is to BACKUP the pack, refresh it, and 
restore it. 



DECsystem-10 MONITOR INTERNALS COURSE 

LAB 1 



The goal of the student in this lab is to object patch the command 
written during week 1 and successfully load and run the monitor. 



PROCEDURE 

1. Copy LIB:SYSTEM.EXE to your own disk area as SYSTEM.EXE. 

2. Run FILDDT (.R FILDDT), in patch mode (/P), to insert your patch 
into SYSTEM.EXE. 

3. Enter your routine in the PAT area as described in the CRASH 
ANALYSIS handbook chapter 9. Be sure to redefine PAT as 
described. 

4. Overlay the COMTAB and DISP entries for the CORE command with 
the sixbit command value, dispatch bits and dispatch address. 
The dispatch address will be the first location in your routine 
in the patch area. 

5. Update CONFIG and SYSDAT to reflect a new version of the 
monitor. 

6. Terminate the patching session by entering a control Z to 
FILDDT. 

7. If you did not define any new symbols or delete any existing 
symbols, do a FILCOM on the original SYSTEM.EXE versus the 
patched SYSTEM.EXE and justify each word that is different. 

8. Load and run the monitor, verifying that the command works as 
expected. 
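
The word-by-word comparison asked for in step 7 can be pictured with 
the sketch below. It is not FILCOM and does not read .EXE files: it 
assumes both monitors have already been unpacked into lists of 36-bit 
words, and the function names and the output layout are hypothetical. 

```python
def diff_words(original, patched):
    """Return (address, old_word, new_word) for every word that differs.

    A word missing from one image is reported as None.
    """
    differences = []
    for address in range(max(len(original), len(patched))):
        old = original[address] if address < len(original) else None
        new = patched[address] if address < len(patched) else None
        if old != new:
            differences.append((address, old, new))
    return differences


def format_diff(differences):
    """Render differences in octal, one line per differing word."""
    def octal(word):
        return "------------" if word is None else f"{word:012o}"

    return [
        f"{address:06o}/ {octal(old)}  {octal(new)}"
        for address, old, new in differences
    ]
```

Each line of the result is one word that has to be justified: a patch 
area entry, an overlaid table entry, or a version word. 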



DECsystem-10 MONITOR INTERNALS COURSE 

FILDDT Lab session # 2 : examining the running monitor 

This lab session requires you to examine portions of the data base of 
the currently running monitor, particularly the JOB TABLES data base. 

To begin, simply type .R FDSYS, and when FILDDT asks "File:", you 
should respond with /M (crlf). Thereafter, regular DDT commands apply. 

Use FILDDT and the Monitor Table Descriptions to answer all the 
following questions. (Note: answers should consist of the table name 
and word label plus the data.) 

I. Job tables 

A. There is a job running under [75,33]. Learn the following: 

1. What program is it running? What PPN is its high segment 
from? 



2. How much core is it running in? Where in core is it? 



3. What is its wait state code? 



B. Find the PDB for this job. 

1. How much run time has it accumulated? Kilo-core ticks? 



2. What is its mcu? Is it swappable? 



DECsystem-10 Monitor Internals Course 

Lab 3 



FILDDT Session To Begin Crash Analysis 



The purpose of this lab description is to guide you 
through the preliminary steps in crash analysis. The 

further labs. Prior to completing the worksheet for 
this lab do the activities listed below. 

After completing the worksheet you should be able 
to describe what part of the monitor data base was 
sabotaged to produce this crash and list the correct 
data base value. 



1. Before you start FILDDT, it will be very helpful 
if you get a SYSTAT of the crash. Type the monitor 
command(s) needed to cause SYSTAT to examine the 
status of LIB:SER001.EXE and write the output in 
your disk area as file SYSTAT.TXT. 

2. Print SYSTAT.TXT on the hardcopy terminal in the 
lab and retrieve the resulting listing. Keep this 
listing next to your terminal for further reference 
while you are using FILDDT. 

3. Run SYS:FILDDT.EXE. Make a monitor specific 
FILDDT by typing the FILDDT command(s) needed to 
load the symbols from LIB:SYSTEM.EXE. Type the 
monitor command(s) needed to save the resulting 
monitor specific FILDDT in your disk area as 
FD.EXE. 

4. Run the FD.EXE you just created. Type the FILDDT 
command(s) needed to examine SER001.EXE. 

5. Complete the crash analysis worksheet for this 
particular crash. Note the crash dump 
worksheet supplement, which explains how to obtain 
the data necessary to complete the worksheet. 



INITIAL CRASH DUMP WORKSHEET 

1. CRASH FILE ________  SERIAL # ________  PROCESSOR ________ 

2. CRASH TIME AND DATE ________ 

3. WHAT INTERRUPT LEVELS WERE IN PROGRESS?  PISTS ________ 

   (1) ___  (2) ___  (3) ___  (4) ___  (5) ___  (6) ___  (7) ___ 

4. HARDWARE STATUS AT TIME OF CRASH 

   UPTSTS ________  EPTSTS ________  APR STATUS ________ 

   UPMP ________  EPMP ________  CURRENT AC BLOCK ________ 

5. WHAT CAUSED THE CRASH DUMP? 

   STOP CODE ___  NON-ZERO IN 30 ___  407 RESTART ___  OTHER ___ 

6. IF THE STOP CODE WAS YOUR ANSWER TO 5, ANSWER THE FOLLOWING: 

   STOPCODE NAME ________  STOPCODE MODULE ________ 

   STOPCODE DESCRIPTION ________ 

   STOPCODE TYPE? 

   HALT ___  STOP ___  JOB ___  DEBUG ___  OTHER ___ 

   DATA ITEM TESTED AND TEST CONDITIONS. 

   EXPECTED VALUE ________  ACTUAL VALUE ________ 

7. CURRENT JOB ________  PPN ________  PROGRAM ________ 

8. WHAT CYCLE DETECTED OR EXPERIENCED THE ERROR? 

   UUO ___  MONITOR ___  DEVICE INTERRUPT ___  OTHER ___ 

9. WHICH MAJOR PROCESS WITHIN THE CYCLE? 

   IF UUO: 

   PRE-DISPATCH ___  COMMON I/O CODE ___  SPECIFIC CODE ___  POST DISPATCH ___ 

   IF MONITOR: 

   TIME ACCOUNTING ___  TIMING REQUESTS ___  HUNG CHECK ___  REQUEUE ___ 

   SWAPPING ___  SCHEDULING ___ 

   IF DEVICE INTERRUPT: 

   DEVICE STATUS ___  RETRY ___  BUFFER CHECK ___ 

   DEVICE START/STOP ___  DISMISS ___ 

10. LAST 10 STACK ENTRIES: 

    VALUE                ROUTINE 



    TOP OF STACK 



11. ANALYSIS OF THE CAUSE OF THE CRASH. 



INITIAL CRASH DUMP WORKSHEET SUPPLEMENT 

THE WORKSHEET WAS DESIGNED FOR INSTRUCTIONAL USE IN ELEMENTARY CRASH ANALYSIS. 
THE PURPOSE OF THE INITIAL CRASH DUMP WORKSHEET IS TO STRUCTURE THE 
DATA COLLECTION PROCEDURE NECESSARY TO ANALYZE THE CAUSE OF A SPECIFIC 
MONITOR CRASH. THE INFORMATION TO BE RECORDED ON THIS WORKSHEET IS 
JUST A SMALL SUBSET OF ALL THE INFORMATION AVAILABLE IN A CRASH DUMP. 

THE PURPOSE OF THIS SUPPLEMENT TO THE DUMP WORKSHEET IS TO EXPLAIN 
WHERE THE ITEMS IN THE WORKSHEET MAY BE FOUND IN A CRASH DUMP AS 
WELL AS HOW TO INTERPRET THEIR CONTENTS. 

1). THE CRASH FILE NAME IS JUST THE NAME OF THE CRASH DUMP, I.E. 
SER001.EXE. 

THE PROCESSOR SERIAL NUMBER MAY BE FOUND IN .C0ASN 
FOR CPU0 AND .C1ASN FOR CPU1. 

THE PROCESSOR TYPE MAY BE DETERMINED FROM THE SERIAL NUMBER ACCORDING 
TO THE FOLLOWING SERIAL NUMBER (IN DECIMAL) ASSIGNMENTS: 

           KA < 513 
     512 < KI < 1025 
    1024 < KL < 4097 
    4096 < KS 

DEPENDING ON PROCESSOR TYPE, FILDDT SHOULD BE SET UP TO 
MAP ADDRESSES. 

2). THE CRASH DATE AND TIME MAY BE FOUND IN LOCATIONS 
LOCYER, LOCMON, LOCDAY, LOCHOR, LOCMIN, LOCSEC IN DECIMAL. 
THIS IS USEFUL FOR CORRELATION WITH OTHER EVENTS THAT OCCURRED 
AT THE TIME OF THE CRASH, E.G. HARDWARE FAILURES. 

3). FOR STOP CODE CRASHES PISTS WILL CONTAIN THE RESULTS OF A 
CONI PI. BITS 21-27 DESCRIBE THE INTERRUPT IN PROGRESS AS 
DESCRIBED IN THE HARDWARE REFERENCE MANUAL SECTION 3.2. 

4). THE HARDWARE STATUS MAY BE FOUND AS FOLLOWS: 

    ITEM               LOCATION   PROCESSOR   BITS 

    UPMP               UPTSTS     KL          23-35 
                       EUBSTS     KI          5-17 
    EPMP               EPTSTS     KL          23-35 
                       EUBSTS     KI          23-35 
    CURRENT AC BLOCK   UPTSTS     KL          6-8 
                       EUBSTS     KI          1-2 
    APR STATUS         APRSTS 

THE BITS IN APRSTS INDICATE VARIOUS 
PROCESSOR ERRORS AS DESCRIBED IN THE HARDWARE REFERENCE 
MANUAL. 

5). THE CAUSE OF THE DUMP CAN BE FOUND BY EXAMINING THE CONSOLE OR OPERATOR 
LOGS. 

6). THE STOP CODE ITSELF CAN BE FOUND IN CRSWHY. THE MODULE CONTAINING THE 
STOP CODE MAY BE FOUND BY TYPING 'S..XXX?' WHERE XXX IS THE STOP CODE. 
LOOK AT STOPCD.MEM IN THE SOFTWARE NOTEBOOKS OR THE CODE IN THE SOURCE 
LISTINGS FOR THE DESCRIPTION OF THE STOP CODE INCLUDING 
THE STOP CODE TYPE. 

DESCRIBE THE CONDITION THAT CAUSED THE STOP 
CODE, I.E. THE SPECIFIC CONDITIONAL TEST MADE INCLUDING THE DATA 
EXPECTED AND ACTUALLY FOUND. 

EXAMINE THE DATA BASE USED TO MAKE THE DECISION TO CRASH THE 
MONITOR. DETERMINE WHETHER ITS VALUE IN CORE OR IN AN AC IS 
CORRECT BY EXAMINING AN UNRUN MONITOR OR MONITOR LISTINGS. 

7). THE CURRENT JOB NUMBER IS STORED IN CURJOB AND .C0JOB. THIS 
IS USEFUL FOR SETTING UP PAGING FOR THE PROPER UPMP. 

8). SYMBOLIC INTERPRETATION OF THE CONTENTS OF P YIELDS INFORMATION 
ABOUT WHAT THE MONITOR WAS DOING WHEN THE ERROR WAS DETECTED. 

    P          PROCESS 

    NULPDL     USED BY THE MONITOR CYCLE 
    370510     UUO LEVEL PUSH DOWN STACK. THIS RESIDES 
               IN THE CURRENT JOB'S UPMP SO SET UP PAGING 
               PRIOR TO REFERENCING THE STACK ITSELF. 
    C'N'PDL    CHANNEL 'N' PUSHDOWN STACK 
    ERRPDL     USED BY THE DIE ROUTINE 

9). THE MAJOR PROCESS CAN BE DETERMINED BY EXAMINING THE CODE AND 
CORRELATING THE PC TO THE FLOW CHARTS USED IN THE MONITOR 
INTERNALS COURSE. 

10). NOTE THE CONTENTS OF THE STACK TO TRACE THE HISTORY OF THE 
EVENTS LEADING TO THE CRASH. 

11). NOTE THE ACTUAL CAUSE OF THE CRASH AFTER ANALYZING 
ALL THE INFORMATION COLLECTED UP TO THIS POINT. THIS ANALYSIS 
MIGHT DETERMINE THE EXACT CAUSE AND BUG FIX, OR JUST SPECULATE 
AS TO WHAT ADDITIONAL INFORMATION NEEDS TO BE KNOWN TO COME TO A FINAL 
CONCLUSION. 
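
Items 1, 3 and 4 of the supplement all come down to simple decoding of 
36-bit words, which can be sketched as below. This is an illustration 
only. The function names are hypothetical, and the mapping of bit 21 to 
channel 1 through bit 27 to channel 7 is an assumption that should be 
checked against the hardware reference manual. Bits are numbered in the 
PDP-10 style, with bit 0 as the most significant bit of the word. 

```python
WORD_BITS = 36


def field(word: int, first_bit: int, last_bit: int) -> int:
    """Extract bits first_bit..last_bit of a 36-bit word (bit 0 = most significant)."""
    width = last_bit - first_bit + 1
    shift = WORD_BITS - 1 - last_bit
    return (word >> shift) & ((1 << width) - 1)


def processor_type(serial: int) -> str:
    """Classify a processor by its decimal serial number, per the table in item 1."""
    if serial < 513:
        return "KA"
    if serial < 1025:
        return "KI"
    if serial < 4097:
        return "KL"
    return "KS"


def channels_in_progress(pists: int) -> list:
    """List the PI channels flagged in bits 21-27 of the saved CONI PI word.

    Assumption: bit 21 corresponds to channel 1 and bit 27 to channel 7.
    """
    bits = field(pists, 21, 27)
    return [channel for channel in range(1, 8) if bits & (1 << (7 - channel))]
```

The same field helper applies to the status words in item 4, for 
example field(uptsts, 6, 8) for the current AC block on a KL, where 
uptsts stands for the contents of UPTSTS read with FILDDT. 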



Decsystem-10 Monitor Internals Course 

Lab 4 



Using the crash analysis worksheet as a guide, analyze 
SER002.EXE as to why it crashed. You should be able to 
find the offending instruction. This is a -701 1091 
crash. Use the same monitor specific FILDDT that was 
made for Lab 3. Remember that this crash dump was 
obtained by poking the monitor; therefore, once you find 
the word in error, your analysis is complete. 



Decsystem-10 Monitor Internals Course 

Lab 5 



Use the crash analysis worksheet to help in analyzing 
SER003.EXE. This crash was obtained by exercising a 
bug. You should be able to determine why the machine 
crashed and after studying what function was being 
performed by the monitor you should be able to outline 
a general cure for the problem. 



DECsystem-10 MONITOR INTERNALS COURSE 

FILDDT Lab session # 6 : examining the running monitor 

This lab session requires you to examine portions of the data base of 
the currently running monitor, particularly the FILSER data base. 

To begin, simply type .R FDSYS, and when FILDDT asks "File:", the 
student should respond with /M. Thereafter, regular DDT commands apply. 

Use FILDDT and the Monitor Table Descriptions to answer all the 
following questions. (Note: answers should consist of the table name 
and word label plus the data.) 

I. FILSER data base 

A. The [75,33] job is reading or writing a file. 

1. Find the PPB and follow the NMB and UFB linkages from it. 

2. Find the UFB. What is the disk address of the UFD? 

3. Find the NMB and from it find the access table for an active 
file. (Note: You may encounter many NMBs but only one will be 
active. Inactive NMBs usually don't point to ACC blocks; they 
point back on themselves.) 

4. What does the access table think is being done to the file? 

B. Find the DDB. (That can be tough if the JDA is not in core.) 

HINT: see PDB (.PDDVL word); the program is using a logical name 
for disk. 

1. What mode is being used to read or write the file? 

2. What is this DDB's logical device name? 

3. What relative block number is being accessed? 

4. Describe the disk allocation of the file from its group 
pointer(s). 

II. Disk file structures, storage allocation, etc. 

A. How many file structures are on the system? 

1. Their names? 



2. Number of units in each structure and physical unit name of 
each? 



3. Describe the active swapping list. How much swap space on 
each unit? How much is free on each unit? 



B. SAT blocks 

1. How many total SAT blocks for DSKB:? How many in core? 



2. How much space left in each SAT block? 



